I.  INTRODUCTION

Carcinogenesis is considered to be a multistep process that often takes place as a
consequence of exposure of cells to toxic, sublethal concentrations of chemicals capable of
eliciting  mutagenesis  by  virtue  of  changes  in  DNA  base  structures.  Historically,  DNA 
modifications  arising  from  cytochrome  P-450-catalyzed  two-electron  oxidation  reactions,  which 
yield  bulky  DNA  base  derivatives,  have  been  regarded  as  significant  in  introducing  mutagenic 
changes  in  DNA  (1-3).  However,  such  reactions  also  yield  another  product,  H2O2,  which  is 
readily converted to •OH in the Fe++-mediated Fenton reaction (4, 5). The first evidence in a
living system for the carcinogenic potential of radical-induced DNA oxidation came from studies
of the marine flatfish English sole. Chronic, low-level exposure to carcinogenic xenobiotics was
capable  of  eliciting  free  radical  oxidative  DNA  lesions  in  English  sole  inhabiting  contaminated 
waterways  (6).  Such  animals  express  a  high  incidence  of  liver  cancer  (up  to  about  30%) 
compared  to  animals  from  pristine  environments  (7).  A  striking  increase  in  a  range  of  •OH- 
induced  base  lesions  was  found  which  correlated  both  with  exposure  to  xenobiotics  and 
incidence  of  liver  cancer  compared  to  unexposed  control  animals  (8).  Subsequently,  studies  of 
•OH  damage  to  DNA  in  a  variety  of  human  cancers  and  tissues  have  been  conducted  (9, 10). 

The  results  of  these  experiments  agree  closely  with  the  English  sole  data  and  support  the  notion 
that •OH-induced DNA damage is intimately associated with carcinogenesis.

We have also shown that substantial quantities of •OH-induced base modifications are
present in DNA from cancers of the female breast (9, 10) and that a characteristic pattern of such
base modifications occurs in the normal and cancer-containing breast in relation to the redox
status of the tissue (9). Histologically normal (reduction mammoplasty) tissue was characterized
by a high ratio of the ring-opening product 4,6-diamino-5-formamidopyrimidine (Fapyadenine;
Fapy-A) to 8-hydroxy-adducts (e.g., 8-hydroxyguanine; 8-OH-Gua). A dramatic change in this
relationship occurred favoring the substantial loss of ring-opening products and a substantial
increase  in  the  8-OH  derivatives  in  the  cancer-containing  breast.  Twenty-two  statistical  base 
models  were  formulated  with  a  high  sensitivity  (e.g.,  91%)  and  specificity  (e.g.,  97%)  for 
classifying  breast  tissue  as  cancer  or  normal  (e.g.,  96%  correct)  (9).  The  dramatic  shift  in  the 
proportions  of  the  •OH-modified  bases  was  attributed  to  changes  in  the  redox  potential  of  the 
breast  that  favor  oxidative  conditions  and  cancer  formation  (9).  The  statistical  models  most 
likely  have  predictive  power  for  assessing  future  breast  cancer  risk.  The  radical-induced  one- 
electron conversions of the DNA from the normal and cancer-containing breast often produced
one  modification  in  several  hundred  normal  bases.  This  is  exceptionally  high,  notably  relative  to 
two-electron  base  modifications  that  form  bulky  adducts  in  the  range  of  1  in  1 0  to  1  in  1 0 
normal bases, as shown by 32P-postlabeling analysis of diverse tissues (11).

Djuric et al. (12) reported that the level of one of the base modifications previously
described  (5-hydroxymethyluracil;  HMUra)  (9)  is  elevated  in  peripheral  blood  of  patients  with 
breast  cancer.  Frenkel  et  al.  (13)  reported  that  5-hydroxymethyl-2'-deoxyuridine  elicited  elevated 
circulating  autoantibody  levels  reactive  with  this  base  structure  in  patients  with  breast  cancer, 
those  at  high  risk,  and  in  those  in  whom  breast  cancer  developed  one  or  more  years  later.  The 
circulating  5-hydroxymethyl-2'-deoxyuridine  appeared  to  originate  in  breast  tissue.  These 
findings support our original hypothesis regarding the potential use of the •OH-induced base
modifications  for  assessing  breast  cancer  risk  (9, 10). 

The  importance  of  the  8-OH-adducts  in  carcinogenesis  is  well  recognized:  8-OH-Gua  has 
been  implicated  as  a  mutagen  and  carcinogen  (14, 15),  has  an  overwhelming  effect  in  causing 
misreplication  in  template-directed  DNA  synthesis  (15),  and  results  in  a  loss  of  methylcytosines 
that  participate  in  regulating  the  expression  of  genes  (16).  However,  the  ring-opening  Fapy 
structures,  as  well  as  bulky  adducts  from  two-electron  oxidations,  were  shown  to  block  DNA 
replication  (17)  and  mRNA  transcription  (18). 

Substantial  oxidative  modifications  occurring  in  the  DNA  bases  in  breast  carcinogenesis 
(9, 10)  imply  that  significant  modifications  also  occur  in  other  aspects  of  the  DNA,  such  as  in  the 
phosphodiester  backbone  and  deoxyribose  (4).  Consequently,  we  studied  DNA  from  the  normal 
and  cancer-containing  breast  using  Fourier  transform-infrared  (FT-IR)  spectroscopy  in 
conjunction  with  the  complementary  technique  of  gas  chromatography-mass  spectrometry  with 
selected  ion  monitoring  (GC-MS/SIM).  FT-IR  spectroscopy  is  eminently  suited  to  the 
identification  of  these  structural  changes  (19).  DNA  from  normal  reduction  mammoplasty  tissue 
(RMT),  invasive  ductal  carcinoma  (IDC),  and  nearby  microscopically  normal  tissue  (MNT)  was 
analyzed  by  FT-IR  spectroscopy.  Statistical  models  based  on  the  spectral  properties  were 
obtained  and  compared  to  a  statistical  model  previously  used  with  the  base  modifications. 

Substantial  differences  were  found  in  the  spectral  properties  of  DNA  from  women  with 
normal  and  cancerous  breast  tissue,  indicating  an  ability  to  discriminate  the  cancerous  tissue 
from  noncancerous  tissue  with  a  sensitivity  and  specificity  of  83%.  More  importantly,  the 
normal population was divided into subgroups in which a non-random progression was identified
and a cancer-like DNA phenotype that was highly correlated (r > 0.90) with that of the patients
with cancer was exhibited in 59% of the women. The spectral data, which correlated highly with
the base-model data, were used to establish a model for predicting the probability of breast
cancer. Consistent with the high cancer recurrence rate in the same breast, 8 of 10 of the
MNTs  remaining  after  tumor  excision  were  classified  as  “cancer”  using  this  model  (10). 

Progressive  structural  changes  in  the  DNA  of  the  normal  female  breast  leading  to  a 
premalignant  cancer-like  phenotype  in  a  high  proportion  of  women  are  the  basis  for  a  new 
paradigm for understanding the etiology of breast cancer and predicting its occurrence at early
stages of oncogenesis. Therapeutic strategies for potentially reversing the extent of DNA
damage, which may be useful in disease prevention and treatment, are also suggested from the
findings. 


II.  BODY 

A.  Experimental  Methods 
1.  Isolation of Medaka liver DNA

During  this  grant  period,  the  USABRDL  provided  samples  of  Medaka  liver  exposed  to 
TCE  together  with  controls.  The  samples  were  used  for  the  isolation  of  pure  DNA  from  as  little 
as  2  to  5  mg  of  starting  material.  Our  previous  "macro"  methods  involved  pretreatment  of  the 
sample  with  Proteinase  K  and  RNase  A,  after  which  recovery  of  the  nucleic  acids  from  the 
cellular lysate was accomplished by employing phenol and chloroform to partition the nucleic




DAMD17-92-2006  Final  Report 


acids  into  an  aqueous  phase  and  the  other  cellular  components,  including  proteins,  into  an 
organic  phase. 

Our  laboratory  employed  a  simpler  and  much  more  efficient  "micro"  method  for  isolation 
of  the  DNA  (See  Table  1).  The  Microprobe  IsoQuick®  Nucleic  Acid  Extraction  Kit  utilizes  the 
properties  of  guanidine  thiocyanate  (GuSCN)  to  both  disrupt  the  cellular  integrity  of  the  sample 
and,  at  the  same  time,  inhibit  the  DNase  and  RNase  activity.  The  GuSCN  is  then  mixed  with  a 
non-corrosive  reagent  containing  a  nuclease-binding  matrix.  The  aqueous  and  organic  phases  are 
separated  by  centrifugation  and  the  DNA  is  precipitated  with  alcohol.  Four  milligrams  of 
purified Chelex® 100 Resin (Bio Rad) is added to the DNA to remove any Fe++ that might be
present. We have also found that the addition of Chelex® 100 Resin results in
greater purification of DNA, yielding a higher A260/A280 spectral ratio. The DNA is
quantitated in aqueous solution by its UV absorption at 260 nm using the relationship 1
absorbance unit = 50 µg/ml.
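
The UV quantitation described above reduces to a short calculation. The following is a minimal sketch only, assuming a 1 cm path length; the function names, the dilution factor argument, and the example readings and volume are hypothetical and are not values from this report.

```python
# Conversion factor from the text: 1 absorbance unit at 260 nm = 50 ug/ml DNA.
UG_PER_ML_PER_A260 = 50.0


def dna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """DNA concentration (ug/ml) from the absorbance at 260 nm."""
    return a260 * UG_PER_ML_PER_A260 * dilution_factor


def dna_yield_ug(a260, volume_ml, dilution_factor=1.0):
    """Total DNA (ug) contained in a solution of the given volume."""
    return dna_concentration_ug_per_ml(a260, dilution_factor) * volume_ml


def purity_ratio(a260, a280):
    """A260/A280 ratio used as the purity index."""
    return a260 / a280


# Hypothetical readings, for illustration only.
a260, a280, volume_ml = 0.90, 0.50, 1.0
print(dna_yield_ug(a260, volume_ml))
print(round(purity_ratio(a260, a280), 2))
```

The same two functions could be applied to each row of a yield table to recompute the DNA amount and purity index from raw absorbance readings.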

2.  Preparation  of  trimethylsilyl  derivatives 

The procedure employed was a modification of that used previously (20). For example,
samples of purified DNA are now usually prepared with 30-50 µg and run in either duplicate or
triplicate, depending upon the availability of DNA. All are treated in evacuated sealed tubes at
140°C for 30 min with 0.25 ml of concentrated formic acid (60%). The treatment with formic
acid does not alter the structure of the nucleotide bases being studied. After hydrolysis, the
samples are dried by lyophilization. The trimethylsilyl derivatives are produced in 0.1 ml of a
mixture of bis(trimethylsilyl)trifluoroacetamide (BSTFA) and acetonitrile (4:1) in
polytetrafluoroethylene-capped hypo vials (Pierce Chemical Company) upon heating for 30
minutes at 140°C.

3.  Synthesis  of  oxidized  nucleotide  bases 

Several  of  the  standards  required  for  the  GC/MS-SIM  procedure  were  obtained  from 
commercial  sources;  others  had  to  be  synthesized  in  our  laboratories. 

8-Hydroxyadenine (8-OH-Ade) was synthesized, using 5-bromocytosine and
8-bromoadenine, respectively. These compounds were allowed to react with 95% concentrated
formic acid at 140°C for 45 minutes. Excess unreacted formic acid was removed by nitrogen
purge and the product was purified by recrystallization from water.

The  2,6-Diamino-4-hydroxy-5-formamidopyrimidine  (Fapyguanine;  Fapy-G)  was 
synthesized from 2,5,6-triamino-4-hydroxypyrimidine sulfate and 80% formic acid at 60°C for
one hour. Excess formic acid was removed with nitrogen. The product of the reaction was
purified by recrystallization from water and purity established by GC-MS/SIM.

4.  GC-MS/SIM 

The analyses for oxidized nucleotide bases were conducted with a Hewlett-Packard Model
5890 microprocessor-controlled gas chromatograph interfaced to a Hewlett-Packard Model
5970B Mass Selective Detector. The injector port and interface were both maintained at 260°C.
The column was a fused silica capillary column (12.0 m, 0.2 mm inner diameter) coated with
cross-linked 5% phenylmethylsilicone gum phase (film thickness, 0.33 µm). The column
temperature was programmed from 120° to 235°C at 10°C/min after 2 min at 120°C. Helium
was used as the carrier gas with a linear velocity of 57.3 cm/s through the column. The amount
of TMS hydrolysate injected onto the column was about 0.5 µg. Quantitation of TMS-nucleotide
bases was done on the basis of the principal ion and confirmation of structure was undertaken
using two qualifier ions.

Several  major  improvements  have  been  made  in  the  GC-MS/SIM  methodology  with  the 
objective  of  "microtizing"  the  DNA  analysis.  A  Hewlett-Packard  Merlin  Microseal®  has  been 
installed  which  decreases  septum  leakage  and,  at  the  same  time,  eliminates  the  presence  of 
septum  particulates  in  the  injection  liner  which  can  cause  activation  of  the  liner.  The  MS 
detector  has  been  changed  to  a  Hewlett-Packard  K-M®  model,  thereby  increasing  the  sensitivity 
of  the  MS  by  about  five-fold.  The  automatic  injector  has  been  altered  so  that  the  syringe  pumps 
each  sample  a  total  of  12  times  with  a  viscosity  delay  of  7  seconds.  The  BSTFA:ACN  solvent  is 
viscous  enough  to  allow  air  bubbles  to  enter  the  syringe  if  a  viscosity  delay  is  not  used.  Though 
minute,  these  air  bubbles  can  cause  a  20-40%  error  in  reproducibility. 

A  most  important  advance  made  in  the  GC-MS/SIM  method  is  the  automation  of  the 
quantitation  procedure.  The  quantitation  files  for  the  base  lesions  have  been  integrated  into  one 
file,  allowing  each  sample  to  be  quantitated  for  all  five  lesions  at  once,  rather  than  by  5  separate 
files.  The  results  are  individually  checked  for  proper  peak  integration  and  then  transferred  to  a 
MS Excel database in which the conversion from pg/µl to nmol/mg DNA (including all recovery
and reproducibility factors) is calculated automatically. The result is in tabular form, and the data are
readily converted to a bar graph or other suitable depiction.

5.  Data  collection  and  analysis 

Using the GC-MS/SIM methodology, characteristic ions (one principal ion and two
qualifier  ions)  were  employed  to  characterize  the  oxidized  bases;  however,  as  indicated,  the 
principal  ion  was  used  for  quantitation.  All  spectra  were  compared  with  spectra  obtained  from 
commercially  obtained  standards  and  authentic  samples  of  TMS  derivatives  synthesized  in  our 
laboratories.  The  data  obtained  included  SIM  plots  and  derived  mass  spectra.  On  the  basis  of 
the  GC-MS/SIM  data,  oxidized  base  concentrations  in  hepatic  DNA  were  calculated  and 
recorded  as  nmol/mg. 

6.  Isolation  of  female  breast  tissue  DNA 

Initially,  RMT  was  obtained  from  15  patients.  The  tissue  from  10  patients  was 
sequentially  cut  into  1  cm  sagittal  sections,  2  cm  apart.  The  tissue  from  the  remaining  five  RMT 
patients  was  divided  into  two  sections.  Two  to  13  sections  were  obtained  from  each  patient  for  a 
total  of  70  samples.  In  addition,  tumor  (IDC)  and  nearby  MNT  were  obtained  from  the 
cancerous  breasts  of  15  surgical  patients.  This  group  comprised  22  samples,  7  of  which  were 
matched  pairs  (IDC-MNT);  the  remainder  were  single  biopsy  specimens  from  either  IDC  tissue 
or MNT. The RMT from the patients who did not have cancer was also microscopically normal,
with the exception of occasional incidences of non-neoplastic changes (e.g., fibrocystic). The
above-mentioned  tissues  were  used  for  a  study  of  the  DNA  bases  via  GC-MS/SIM. 

Subsequently,  normal  RMT  was  obtained  from  29  patients  living  in  the  Puget  Sound 
region of Washington State; one to four sections were obtained from each patient, totaling 41
samples. Eighteen IDC tissues were obtained from the breasts of 18 surgical patients, and 11
nearby MNT specimens were obtained from an additional 10 surgical patients. The IDC and
MNT specimens were obtained from local hospitals and The Cooperative Human Tissue
Network (Cleveland, OH). The RMT and the MNT exhibited occasional incidences of
non-neoplastic fibrocystic changes; other cellular changes were essentially absent. These tissues were
used in studies involving FT-IR spectroscopy in conjunction with GC-MS/SIM.

Each  of  the  excised  tissues  was  frozen  in  liquid  nitrogen  and  maintained  at  -80°C.  The 
DNA was isolated from ~350 mg of tissue and the purity was established as previously
described (9, 10). The DNA (yield: ~120-150 µg), dissolved in deionized water, was aliquoted
into portions for GC-MS/SIM (~50 µg) (9, 10) and FT-IR spectroscopy (~20 µg), and the aliquots
were dried completely by lyophilization, purged with pure nitrogen and stored in evacuated,
sealed glass vials.

7.  Gas  chromatographic-mass  spectral  and  FT-IR  analysis  of  breast  tissues 

Lyophilized  DNA  from  female  breast  tissues  was  hydrolyzed,  derivatized  and  analyzed 
by  the  same  GC-MS/SIM  method  as  used  for  Medaka  liver  (9,  10).  Infrared  (IR)  spectra  of  59 
DNA  samples  were  obtained  with  a  Perkin-Elmer  System  2000  Fourier  Transform-IR 
spectrometer (The Perkin-Elmer Corp., Norwalk, CT) equipped with an IR microscope and a
wide range mercury-cadmium-telluride detector. The DNA was placed on a BaF2 plate in an
atmosphere with a relative humidity of less than ~60% and flattened to make a transparent film.
Using the IR microscope, a uniform and transparent portion of the sample was selected to avoid a
scattering or wedge effect in obtaining transmission spectra. Each analysis was performed in
triplicate on 3-5 µg of DNA and the spectra were computer averaged. Two hundred and fifty-six
scans at 4 cm⁻¹ resolution were performed to obtain spectra in a frequency range of 4000-700
cm⁻¹. Typically 3-5 minutes elapsed from when the glass vial was broken to when each spectrum
was obtained. None of the spectra showed a 1703 cm⁻¹ band, which is indicative of specific base
pairing.  This  fact  indicates  that  the  samples  were  free  from  water  and  had  acquired  a  disordered 
form,  the  D-configuration  (21-23). 

The  spectra  were  obtained  in  transmission  units  and  converted  to  absorbance  units  for 
data  processing.  The  Infrared  Data  Manager  software  package  (The  Perkin-Elmer  Corp.)  was 
used  to  control  the  spectrometer  and  to  obtain  the  IR  spectra.  The  GRAMS/2000  software 
package  (Galactic  Industries  Corp.,  Salem,  NH)  was  used  to  perform  postrun  spectrographic  data 
analysis.  Each  spectrum  was  converted  to  a  spreadsheet  format  giving  a  specific  absorbance  for 
every wavenumber between 4000 and 700 cm⁻¹.

8.  IR spectral and statistical analysis, and correlations with the Fapy-A/8-OH-Ade model

The  IR  spectral  and  statistical  analyses  were  used  to  test  the  hypotheses  that  systematic 
differences  occur  in  DNA  between  normal  and  cancer  tissue  and  that  a  progressive,  non-random 
modification of the DNA in the normal breast culminates in a cancer-like phenotype. This
phenotype  likely  would  constitute  a  premalignant  biomarker  for  cancer  development. 

A  baseline  adjustment  was  made  in  all  spectra  to  remove  the  effect  of  background 
reflectance. The mean across 11 wavenumbers, centered at the lowest point (2000 to 1700 cm⁻¹
range),  was  subtracted  from  absorbances  at  each  frequency. 
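
The baseline step above can be sketched as follows. This is an illustrative reading of the description, not the software actually used: it assumes one absorbance value per integer wavenumber, and the clipping of the 11-point window at the ends of the spectrum, the function name, and the synthetic test spectrum are assumptions.

```python
def baseline_adjust(wavenumbers, absorbances, lo=1700, hi=2000, window=11):
    """Subtract the mean of `window` points centred on the minimum in [lo, hi]."""
    in_range = [i for i, w in enumerate(wavenumbers) if lo <= w <= hi]
    i_min = min(in_range, key=lambda i: absorbances[i])
    half = window // 2
    start = max(0, i_min - half)                    # clip at the spectrum edges
    stop = min(len(absorbances), i_min + half + 1)
    baseline = sum(absorbances[start:stop]) / (stop - start)
    return [a - baseline for a in absorbances]


# Synthetic spectrum (hypothetical): one point per integer wavenumber from
# 2100 down to 1600, with a V-shaped minimum at 1850.
wavenumbers = list(range(2100, 1599, -1))
absorbances = [0.2 + 0.001 * abs(w - 1850) for w in wavenumbers]
adjusted = baseline_adjust(wavenumbers, absorbances)

i_low = wavenumbers.index(1850)
# Mean of the 11-point window after adjustment (zero up to rounding error).
print(sum(adjusted[i_low - 5:i_low + 6]) / 11)
```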

The specimens varied in thickness, yielding diverse absorbances or spectral intensities
unrelated to cancer or noncancer type. Because there was no well-established reference peak
in the frequency range of interest to use for normalization, normalization was achieved by
converting all absorbances to a constant mean intensity.

The region 1503 to 761 cm⁻¹ (a span of 743 wavenumbers) was chosen as the primary
region for analysis because it constituted the most common valley-to-valley absorbance range
among all cancer and noncancer spectra. This area of the spectrum was chosen without regard to
cancer/noncancer status. Absorbances at all wavenumbers were divided by the mean absorbance
over the range 1503 to 761 cm⁻¹ for that spectrum. This resulted in a mean spectral intensity of
1.0  for  each  specimen.  Unless  otherwise  stated,  all  analyses  were  carried  out  on  these  baselined 
and  normalized  spectra. 
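
The normalization described above can be illustrated with a short sketch. It assumes one absorbance value per integer wavenumber between 4000 and 700 cm⁻¹; the function name and the synthetic spectrum are hypothetical.

```python
def normalize_spectrum(wavenumbers, absorbances, lo=761, hi=1503):
    """Divide every absorbance by the mean absorbance over [lo, hi]."""
    region = [a for w, a in zip(wavenumbers, absorbances) if lo <= w <= hi]
    mean_region = sum(region) / len(region)
    return [a / mean_region for a in absorbances]


# Synthetic, already baselined spectrum (hypothetical values).
wavenumbers = list(range(4000, 699, -1))
absorbances = [0.05 + 0.0001 * (w % 400) for w in wavenumbers]
normalized = normalize_spectrum(wavenumbers, absorbances)

# The analysis region holds 743 integer wavenumbers and now has mean 1.0.
region = [a for w, a in zip(wavenumbers, normalized) if 761 <= w <= 1503]
print(len(region), sum(region) / len(region))
```

Dividing by the regional mean removes the overall intensity differences caused by film thickness while leaving the relative shape of each spectrum unchanged.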

The  cancer  and  noncancer  spectral  groups  were  compared  by  plotting  means  for  each  of 
the two groups at each frequency. A mean cancer and a mean noncancer DNA spectrum were
obtained. For patients with multiple specimens, the normalized absorbances were averaged to
yield a single mean spectrum per patient. At each spectral frequency, a t-test for the difference
between cancer and noncancer normalized absorbances was conducted. This yielded one P value
per frequency. The method of Schweder and Spjotvoll (24) was used to estimate the number of
frequencies with cancer/noncancer differences that were unlikely to be random. Cancer and
noncancer spectra were also compared based on several other measures. One such measure is
distance-C, which was calculated as follows. The 18 cancer spectra were taken as a reference library. The
closeness or distance of each index spectrum from the cancer library was measured by
calculating the Pearson correlation coefficient between the index spectrum and the spectrum of
each member of the cancer library. For each index noncancer spectrum, the Pearson correlation
of that spectrum with each of the 18 spectra in the cancer library was calculated, yielding 18
correlation values. The mean of these 18 correlations was then calculated as distance-C. For an
index cancer spectrum, only 17 spectra in the library were used so that a spectrum was not
compared to itself. For each index cancer spectrum, 17 correlation values were averaged to yield
distance-C.  Thus,  every  spectrum  has  a  single  value  of  distance-C. 
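
The distance-C calculation can be sketched as below. This is a minimal illustration under stated assumptions: spectra are plain lists of normalized absorbances on a common wavenumber grid, and the function names, the `exclude_index` argument, and the three-member toy library are hypothetical. Because distance-C is a mean correlation, larger values indicate greater similarity to the cancer library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_c(index_spectrum, cancer_library, exclude_index=None):
    """Mean Pearson correlation between a spectrum and the library members.

    For an index cancer spectrum, pass its position in the library as
    `exclude_index` so that it is not compared with itself.
    """
    rs = [pearson(index_spectrum, ref)
          for j, ref in enumerate(cancer_library) if j != exclude_index]
    return sum(rs) / len(rs)


# Toy library of three "cancer" spectra (hypothetical values).
library = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]

# A noncancer index spectrum is compared with every library member.
print(distance_c([1.0, 2.5, 3.0], library))
# A cancer index spectrum (library member 0) skips itself.
print(distance_c(library[0], library, exclude_index=0))
```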

Another distance measure, distance-A, is the usual straight line distance between two
points in a multidimensional space. Across the 743 frequencies from 1503 to 761 cm⁻¹,
distance-A was calculated as the square root of the sum of squared differences of normalized absorbances
between  two  spectra.  For  a  particular  spectrum,  distance-A  was  averaged  across  all  spectra  in  the 
library. Differences between cancer and noncancer (spectral) groups on distance-C and
distance-A were evaluated by t-tests. The difference between groups with respect to the distance-C
measure was also assessed by a simulation study or permutation test. In the latter test, 18 of the 47
patients were randomly designated as having cancer and placed in a library. This random
designation was performed 10,000 times and a t-statistic was calculated for each of the 10,000
trials. The proportion of simulated t-statistics that exceeded the obtained t-statistic yielded an
empirically derived P value that served as a check on the obtained P value.
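
Distance-A and the permutation check described above can be sketched as follows. This is a hedged illustration, not the original analysis code: the pooled-variance form of the t statistic is an assumption (the report does not say which variant was used), all function names are hypothetical, and the toy data use 4 "cancer" and 6 "noncancer" spectra with 200 trials rather than 18 of 47 patients and 10,000 trials.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_a(spectrum, library):
    """Mean Euclidean distance from one spectrum to each library spectrum."""
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(spectrum, ref)))
             for ref in library]
    return sum(dists) / len(dists)


def distance_c_all(spectra, library_ids):
    """Distance-C of every spectrum; a library member skips itself."""
    values = []
    for i, spec in enumerate(spectra):
        rs = [pearson(spec, spectra[j]) for j in library_ids if j != i]
        values.append(sum(rs) / len(rs))
    return values


def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (assumed variant)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled * (1 / na + 1 / nb))


def group_t(spectra, library_ids):
    """t statistic comparing distance-C inside vs. outside the library."""
    d = distance_c_all(spectra, library_ids)
    members = set(library_ids)
    inside = [d[i] for i in members]
    outside = [d[i] for i in range(len(spectra)) if i not in members]
    return t_statistic(inside, outside)


def permutation_p(spectra, cancer_ids, n_trials=10000, seed=0):
    """Share of random 'cancer' designations whose t exceeds the observed t."""
    rng = random.Random(seed)
    observed = group_t(spectra, cancer_ids)
    exceed = 0
    for _ in range(n_trials):
        fake_ids = rng.sample(range(len(spectra)), len(cancer_ids))
        if group_t(spectra, fake_ids) > observed:
            exceed += 1
    return observed, exceed / n_trials


# Toy data (hypothetical): 4 "cancer" and 6 "noncancer" spectra of 20 points.
rng = random.Random(1)
cancer_shape = [math.sin(k / 3) for k in range(20)]
normal_shape = [math.cos(k / 3) for k in range(20)]
spectra = [[v + rng.gauss(0, 0.1) for v in cancer_shape] for _ in range(4)]
spectra += [[v + rng.gauss(0, 0.1) for v in normal_shape] for _ in range(6)]

observed_t, p_value = permutation_p(spectra, [0, 1, 2, 3], n_trials=200)
print(observed_t, p_value)
```

Rebuilding the library on every trial mirrors the description in the text, in which the randomly designated patients themselves form the reference library for that trial.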

The  location  and  normalized  absorbances  of  various  peaks  were  examined.  Four  bands 
are especially obvious in most spectra. These occur at about 1650, 1410, 1230, and 1060 cm⁻¹.
Even though the 1650 cm⁻¹ band was not within the frequency range of 1503 to 761 cm⁻¹, it was
included  in  this  comparison  because  it  is  an  easily  identifiable  peak. 

We  wanted  to  determine  if  the  distance  from  the  cancer  library  was  systematically  related 
to  the  shape  of  the  spectrum.  Under  the  null  hypothesis,  cancer  and  noncancer  spectra  differ  only 
randomly.  That  is,  at  a  given  frequency,  a  noncancer  DNA  spectrum  is  just  as  likely  to  occur 
above  as  below  a  cancer  spectrum.  Also,  under  this  hypothesis,  noncancer  spectra  that  are  more 
distant  from  the  cancer  library  would  be  distant  in  a  random  fashion,  sometimes  lying  above  or 
below  the  mean  cancer  spectrum  and  lacking  a  systematic  difference.  Correlation  was  used  to 
test this null hypothesis for the noncancer group; each spectrum had only one value of distance-C,
but many normalized absorbances, one per integer wavenumber. If the normalized absorbances shifted in a systematic
way  with  distance  from  the  cancer  library,  then  this  relationship  would  be  expected  to  be 
reflected  in  a  correlation  between  distance  and  normalized  absorbance.  If  there  was  a  systematic 
shift,  either  a  large  positive  or  large  negative  correlation  of  the  normalized  absorbance  across  the 
data  for  the  29  noncancer  patients  would  be  expected  when  their  normalized  absorbances  at  a 
specific  wavenumber  were  plotted  versus  distance-C.  Wavenumbers  that  had  substantial 
correlation values between normalized absorbance and distance-C pinpoint regions of the
spectrum that shift in a systematic way. Thus, we calculated the Pearson correlations between
distance-C and the normalized absorbance for each integer wavenumber. The same value of
distance-C was used for a particular spectrum whenever it entered into a correlation calculation,
but the normalized absorbance varied according to the specific wavenumber for which
correlation was being calculated. In the range 1503 to 761 cm⁻¹, 743 Pearson correlation
coefficients were calculated. The statistical significance of the correlation at each wavenumber
was  also  assessed,  and  the  Schweder  and  Spjotvoll  (24)  method  was  used  to  estimate  the  number 
of  frequencies  for  which  there  was  a  true  association  between  distance  from  the  cancer  library 
and  normalized  absorbance. 
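
The per-wavenumber correlation described above can be sketched as follows. The sketch assumes each spectrum is a list of normalized absorbances on a shared wavenumber grid; the function names and the four-spectrum, three-wavenumber toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_profile(spectra, distance_c_values):
    """One Pearson r per wavenumber: absorbance across patients vs. distance-C.

    The same distance-C value is reused for a spectrum at every wavenumber;
    only the absorbance changes from one wavenumber to the next.
    """
    n_points = len(spectra[0])
    return [pearson([s[k] for s in spectra], distance_c_values)
            for k in range(n_points)]


# Toy data (hypothetical): 4 noncancer spectra, 3 wavenumbers each.
distance_c_values = [0.90, 0.92, 0.95, 0.99]
spectra = [
    [1.00, 2.00, 1.0],
    [1.10, 1.90, 1.3],
    [1.25, 1.75, 0.9],
    [1.45, 1.55, 1.1],
]
profile = correlation_profile(spectra, distance_c_values)
print([round(r, 3) for r in profile])
```

In the toy data the first wavenumber rises and the second falls with distance-C, so both give correlations of large magnitude, while the third has no systematic trend.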

Factor  analysis  was  employed  to  study  spectral  variations  and  the  relation  of  these 
variations to cancer versus noncancer DNA status. The approach used was equivalent to
principal component analysis. Analysis was performed on the normalized absorbance data
without the mean removed, so that the first factor mainly reflects a mean spectrum across all
cancer and noncancer DNA. The mean of the spectra for multiple sections from noncancer
patients was taken to obtain a single spectrum per patient.
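
The statement that the first factor of an uncentered analysis mainly reflects the mean spectrum can be checked numerically. The sketch below is an assumption about one way to do this, not the software used in the study: it extracts the leading factor of the uncentered data matrix by power iteration and compares it with the mean spectrum. The function names and the toy data are hypothetical.

```python
import math


def first_factor(spectra, n_iter=100):
    """Leading factor of the uncentred data matrix, found by power iteration."""
    n_points = len(spectra[0])
    v = [1.0 / math.sqrt(n_points)] * n_points
    for _ in range(n_iter):
        # scores = X v, then w = X^T (X v)
        scores = [sum(a * b for a, b in zip(row, v)) for row in spectra]
        w = [sum(s * row[k] for s, row in zip(scores, spectra))
             for k in range(n_points)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))


# Toy normalized spectra (hypothetical): small variations around one shape.
spectra = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1]]
mean_spectrum = [sum(col) / len(col) for col in zip(*spectra)]
factor = first_factor(spectra)
print(cosine(factor, mean_spectrum))
```

A cosine close to 1 indicates that the first factor points in essentially the same direction as the mean spectrum, so the later factors carry the variation around that mean.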

Analyses were conducted to relate spectral characteristics among cancer and noncancer
DNA to •OH-induced base structural changes occurring in breast carcinogenesis as described
previously (10). In the earlier study, striking associations were found between cancer/noncancer
status and the ratio of ring-opening base structures to 8-OH-adducts. Thus, in the present study,
a statistical model based on log10(Fapy-A/8-OH-Ade), having a high sensitivity (91%) and
specificity (96%), was used (sensitivity is defined as the percentage of cancer patients correctly
classified and specificity is defined as the percentage of noncancer patients correctly classified).
This logarithm of concentration ratios was used for 29 patients, 10 with and 19 without cancer.
The Pearson correlation was calculated between the Fapy-A/8-OH-Ade model and the spectral
descriptive characteristics. For patients with multiple sections, the mean of the log10 ratios was
used. 
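
The definitions of sensitivity and specificity given above translate directly into a small calculation. The sketch below is illustrative only; the function name and the classification outcomes (9 of 10 cancer and 18 of 19 noncancer patients correctly classified) are hypothetical and are not results from this study.

```python
def sensitivity_specificity(is_cancer, predicted_cancer):
    """Return (sensitivity %, specificity %) from true and predicted labels."""
    pairs = list(zip(is_cancer, predicted_cancer))
    true_pos = sum(1 for t, p in pairs if t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    true_neg = sum(1 for t, p in pairs if not t and not p)
    false_pos = sum(1 for t, p in pairs if not t and p)
    sensitivity = 100.0 * true_pos / (true_pos + false_neg)
    specificity = 100.0 * true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical outcome for 10 cancer and 19 noncancer patients.
is_cancer = [True] * 10 + [False] * 19
predicted = [True] * 9 + [False] + [False] * 18 + [True]
sens, spec = sensitivity_specificity(is_cancer, predicted)
print(round(sens, 1), round(spec, 1))
```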

A  model  for  classifying  the  status  of  patients  with  and  without  cancer  was  developed 
using  logistic  regression  (25).  Logistic  regression  yields  a  model  for  the  probability  that  a 
patient  is  in  the  cancer  group  as  a  function  of  the  descriptive  characteristics  included  in  the 
model. 
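
The form of such a model can be illustrated with a single descriptive characteristic. The sketch below is a minimal example under stated assumptions: it fits p = 1 / (1 + exp(-(b0 + b1 x))) by plain gradient ascent on the log-likelihood, which is a stand-in for whatever fitting routine was actually used, and the descriptor values, labels, learning rate, and iteration count are hypothetical.

```python
import math


def logistic(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, n_iter=5000):
    """Fit p(cancer) = logistic(b0 + b1 * x) by gradient ascent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = yi - logistic(b0 + b1 * xi)   # observed minus predicted
            g0 += err
            g1 += err * xi
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n
    return b0, b1


# Hypothetical descriptor values (x) and cancer status (y: 1 = cancer).
x = [-1.2, -0.8, -0.5, -0.1, 0.2, 0.4, 0.9, 1.3]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(x, y)

p_high = logistic(b0 + b1 * 1.3)    # predicted probability at a high value
p_low = logistic(b0 + b1 * -1.2)    # predicted probability at a low value
print(round(p_high, 3), round(p_low, 3))
```

A patient would then be classified as "cancer" when the predicted probability exceeds a chosen cutoff; the cutoff is a design choice that trades sensitivity against specificity.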

B.  Results

1.  Studies on Medaka Using Reduced Tissue Sample Weights

Preliminary  findings  indicate  that  sufficient  DNA  is  obtained  from  a  single  Medaka  liver 
to  allow  analysis  by  GC-MS/SIM  in  duplicate  (Table  1).  The  modifications  made  in  the  DNA 
extraction  procedure,  the  increased  sensitivity  of  the  Hewlett  Packard  instrument  and  other 
modifications  in  technique  have  made  the  "microtization"  a  reality. 

Table 1.  Improved DNA Extraction Methodology

Sample ID             Liver Weight (mg)    Amt. DNA (µg)*    A260/A280
EE3-93-027-17-14             3.9                45.7            1.65
EE3-93-027-17-5              6.4                92.5            1.82
EE3-93-027-17-11             3.0                40.5            1.90
EE3-93-027-18-24             3.0                63.8            1.91
EE3-93-027-18-27             3.4                68.0            1.84
EE3-93-027-18-33             4.1                93.8            1.86
EE3-93-027-17-26             3.6                71.0            1.91
EE3-93-027-17-6              5.7                30.0            1.63
EE3-93-027-17-19             5.2               136.0            1.68
EE3-93-027-17-20             2.2                40.5            1.80
EE3-93-027-17-28             3.6                65.1            1.78
EE3-93-027-17-29             5.8                44.6            1.75
EE3-93-027-17-47             3.3                30.0            1.73
EE3-93-027-17-23             6.3                60.8            1.66
EE3-93-027-17-33             5.4                67.3            1.83

* The significant differences obtained in yields are likely attributable to imprecision in
“cut-points” in the solvent extraction.


2.  Application  of  the  GC-MS/SIM  Technique  to  the  Analysis  of  Human  Normal  and  Cancerous 
Female  Breast  Tissues 

As  a  consequence  of  the  work  done  on  this  project,  substantial  •OH-induced  base  lesions 
were  found  in  the  DNA  of  IDC  of  the  female  breast.  However,  virtually  no  information  was 
available  regarding  relationships  between  the  different  base  lesions  in  the  normal  and  cancerous 
breast. Such information is essential in understanding initial stages in the development of breast
cancer  and  the  potential  of  the  base  lesions  as  early  predictors  of  cancer  risk. 

The  •OH-induced  DNA  base  lesions  in  normal  RMT  were  compared  to  those  from  IDC 
and nearby MNT. Comparisons were then undertaken on relationships between the base lesion
profiles  in  the  normal  and  cancerous  breast  using  twenty-two  statistical  models. 

DNA  from  the  RMT  was  characterized  by  a  high  ratio  of  ring-opening  products  (e.g.,  4,6- 
diamino-5-formamidopyrimidine)  to  OH-adducts  of  adenine  and  guanine.  A  dramatic  shift  in 
this  relationship  in  favor  of  carcinogenic  hydroxy-adducts  (e.g.,  8-OH-Gua)  was  found  in  the 
cancerous  breast.  Statistical  models  with  a  high  sensitivity  (91%)  and  specificity  (97%)  provided 
a  consistent  means  of  classifying  tissues  (e.g.,  96%  correct). 

The  dramatic  shift  in  the  DNA  base  lesion  relationships  in  oncogenesis  is  attributed  to 
alterations  in  the  redox  potential  of  the  breast  favoring  oxidative  conditions  and  cancer 
formation.  These  findings  suggest  that  base  lesion  profiles  are  potential  sentinels  for  cancer  risk 
assessment.  Further,  intervention  in  controlling  the  tissue  redox  potential  may  provide  benefit  in 
delaying  or  preventing  early  oncogenic  changes  and  the  ultimate  manifestation  of  cancer. 

Graphical analysis showed the logarithm of values to be more closely related to cancer
vs. noncancer origin of tissue sections and more normally distributed than values on the natural
scale. Thus, we used log10 concentrations and log10 ratios of concentrations in all analyses. The
concentration of HMUra was below the detection limit of 0.0002 nmol/mg DNA for 14 sections
and these sections were assigned a value of 0.0001 nmol/mg DNA. The mean values for cancer
and noncancer tissue of the log10 concentrations and ratios and the statistical significance of
differences were calculated using methods developed by Laird and Ware that, in our case, take
account of the dependence of multiple sections from individual patients (21). The method is
similar to ordinary multiple linear regression in other regards.
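
The preprocessing described above (assigning 0.0001 nmol/mg DNA to sections below the detection limit, then taking log10) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the use of None to encode a non-detect are hypothetical, and only the two numeric constants come from the text.

```python
import math

DETECTION_LIMIT = 0.0002  # nmol/mg DNA, HMUra detection limit quoted in the text
SUBSTITUTE = 0.0001       # nmol/mg DNA, value assigned to non-detects in the text


def log10_concentration(value):
    """Return log10 of a concentration in nmol/mg DNA.

    A value of None (hypothetical encoding of "not detected") or any value
    below the detection limit is replaced by SUBSTITUTE before the transform.
    """
    if value is None or value < DETECTION_LIMIT:
        value = SUBSTITUTE
    return math.log10(value)


def log10_ratio(numerator, denominator):
    """Return log10 of the ratio of two positive concentrations."""
    return math.log10(numerator / denominator)
```

The substitution keeps every section in the analysis while avoiding the logarithm of zero; the Laird-Ware mixed-model fit itself is not reproduced here.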

In order to build a model for predicting the origin of the tissue sections (cancer vs.
noncancer), we used an extension of these methods for binary variables. In our context, this is a
model for the probability that a specific tissue derives from a cancer patient rather than a
noncancer patient. The probability is expressed as a function of log10 concentrations or ratios of
concentrations. To use it as a predictive model, a cut-off probability, Pc, is selected (e.g.,
Pc = 0.5) and tissue samples with an estimated probability above this value are labeled as
cancer-derived. We calculated the sensitivity and the specificity of the classification, based on
trial cut-off values from Pc = 0.1 to Pc = 0.9 in 0.1 increments, and chose the value of Pc that
gave the highest combined values (expressed as a sum) of sensitivity and specificity.
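
The cut-off selection described above can be sketched as a simple scan. This is an illustrative sketch only: the function names are hypothetical, the predicted probabilities would come from the fitted model (not reproduced here), and the tie-breaking rule (smallest cut-off wins) is an assumption, since the text does not state one.

```python
def sensitivity_specificity(probs, is_cancer, cutoff):
    """Label samples with probability > cutoff as cancer-derived and return
    (sensitivity, specificity) against the known origins in is_cancer."""
    true_pos = sum(1 for p, c in zip(probs, is_cancer) if c and p > cutoff)
    true_neg = sum(1 for p, c in zip(probs, is_cancer) if not c and p <= cutoff)
    n_cancer = sum(1 for c in is_cancer if c)
    n_noncancer = len(is_cancer) - n_cancer
    return true_pos / n_cancer, true_neg / n_noncancer


def best_cutoff(probs, is_cancer):
    """Scan Pc = 0.1, 0.2, ..., 0.9 and return the cut-off with the largest
    sensitivity + specificity (ties go to the smallest cut-off, an assumption)."""
    cutoffs = [k / 10 for k in range(1, 10)]
    return max(cutoffs, key=lambda pc: sum(sensitivity_specificity(probs, is_cancer, pc)))
```

Maximizing the sum of sensitivity and specificity weights the two error types equally; a screening application might reasonably weight them differently.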

We determined if mean concentrations or ratios of concentrations differed between MNT
and IDC tissue from cancer patients, using the Laird-Ware model with the log10 values as
dependent variables, MNT vs. IDC as a dichotomous independent variable, and patient as a
random effect.

The  GC-MS/SIM  analyses  revealed  dramatic  differences  in  the  concentrations  of  the 
DNA  base  lesions  between  the  cancerous  breast  and  the  RMT.  Both  the  IDC  and  the  MNT  were 
characterized  by  relatively  high  proportions  of  OH-adducts  produced  via  the  oxidation  of  the 
nucleotide  bases.  The  base  lesions  were  8-OH-Ade,  8-OH-Gua  and  HMUra.  Fapy  derivatives 
(Fapy-A and Fapy-G), which are produced through reductive pathways from the initially formed
8-oxyl derivatives, were present in relatively small concentrations in the cancerous breast.
However,  the  relationship  between  the  concentrations  of  the  OH-adduct  and  Fapy  derivatives 
was  dramatically  different  in  the  RMT.  Overall,  a  clear  distinction  was  evident  between  the 
ratios  of  Fapy:8-OH  base  lesion  concentrations  in  the  cancerous  tissue  and  those  of  the  normal 
tissue. 

None  of  the  tissues  examined  showed  evidence  of  inflammatory  responses  during 
histologic  examination.  Thus,  there  is  no  evidence  for  any  contribution  from  infiltrating  cells  in 
the  proportions  of  reported  DNA  lesions.  Each  of  the  IDC  and  MNT  specimens  had  a  mirror- 
image  "control"  histologic  section  prepared  and  examined  in  the  absence  of  any  knowledge  of 
the  DNA  base  lesion  data. 

Fapy-A concentrations predominated over 8-OH-Gua concentrations in the RMT sections by a
factor of approximately 4 to 10, as depicted in Figure 1 below:

[Figure 1: scatterplot. Vertical axis: logarithmic scale from 0.0001 to 100; horizontal axis:
Tissue Number.]

Figure 1. A scatterplot depicting the relationship between the log10 concentration of the base lesions
(nmol/mg DNA) versus tissues analyzed. Panel A: RMT; panel B: IDC and MNT. Circles represent
Fapy-A and X represents 8-OH-Gua.

Remarkably high concentrations of Fapy-A were found in the RMT (mean ± S.E. = 2.9 ±
0.49 nmol/mg DNA; one base lesion in 320 normal bases). For example, one
patient had an RMT section that contained 21.0 nmol Fapy-A/mg DNA, or one base lesion in 46
normal bases. Overall, high concentrations of Fapy derivatives in the RMT did not prevent the
formation  of  significant  concentrations  of  OH-adducts  in  some  tissues.  The  tissue  section  from 
the  patient  mentioned  above  had  a  relatively  high  8-OH-Gua  concentration  of  1.1  nmol/mg 
DNA  (one  base  lesion  in  540  normal  bases).  Thus,  two  aspects  relevant  to  •OH-induced 
carcinogenesis  are  the  redox  status  of  the  tissue  and  the  absolute  concentrations  of  mutagenic 
OH-adducts  (e.g.,  8-OH-Gua).  Both  of  these  parameters  would  be  pivotal  in  the  assessment  of 
carcinogenic  risk  factors. 

In  contrast,  the  IDC  and  MNT  sections  were  characterized  overall  by  elevations  in  8-OH- 
Gua  compared  to  RMT,  coupled  with  a  marked  depletion  of  Fapy-A  residues.  Thus,  the  results 
further  indicate  that  fundamental  differences  exist  in  the  nature  of  the  •OH-induced  base  damage 
in relation to cancerous and noncancerous tissues. This is evident from the histogram shown
below in Figure 2, which depicts the concentrations (mean ± S.E.) of the base lesions in the
RMT, MNT and IDC tissues.


[Figure 2: histogram. Horizontal axis groups: RMT, MNT, IDC.]

Figure 2. A histogram of DNA base lesion values (mean ± standard error) for the cancerous (MNT and
IDC) and normal female breast (RMT). The relatively low base lesion values for HMUra are designated
by a triangle: RMT = 0.0007 ± 0.0001; MNT = 0.0021 ± 0.0004; and IDC = 0.0021 ± 0.0005.

A statistical analysis of the data was conducted, yielding the mean values of the various
indicators (log10 concentrations or log10 ratios of concentrations) for cancer and noncancer tissue
and the statistical significance of differences using the Laird-Ware regression model. The
sensitivity and specificity were calculated using the predictive logistic regression model. Most
significance levels are strikingly small, indicating prominent differences between cancer and
noncancer tissue with respect to a wide array of predictors. No correlation between patient age
and predictors was observed. Consequently, age was not included in the analyses of either the
cancer dataset or the noncancer dataset, and can be ruled out as a cause of the strong
association between the base lesions and the origins of tissue sections. Examples of these data
are presented in Table 2.

Due to the number of comparisons made (22 predictors were assessed; 9 of high
sensitivity and specificity are given as examples in Table 2), it is likely that one or
two would be statistically significant by chance alone. However, if all the p-values determined
are multiplied by 22, which is the conservative Bonferroni adjustment, almost all of the p-values
would still be statistically significant, including those for the log10 ratio that is considered further
below.
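
The Bonferroni adjustment mentioned above amounts to multiplying each p-value by the number of tests and capping the result at 1. A minimal sketch follows; the function name and the example p-values are hypothetical and are not taken from Table 2, and only the count of 22 predictors comes from the text.

```python
N_PREDICTORS = 22  # number of predictors assessed, from the text


def bonferroni(p_values, n_tests):
    """Return Bonferroni-adjusted p-values: p * n_tests, capped at 1.0."""
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical unadjusted p-values, for illustration only.
example = [0.0001, 0.001, 0.01]
adjusted = bonferroni(example, N_PREDICTORS)
still_significant = [p < 0.05 for p in adjusted]
```

With 22 tests, an unadjusted p-value must fall below roughly 0.0023 to remain under 0.05 after adjustment, which is why only very small p-values survive.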
Table 2. Mean log10 concentrations and log10 ratios of concentrations of DNA-base lesions from
reductive mammoplasty tissue (RMT) and cancerous breast tissue (IDC and MNT) and sensitivity
and specificity based on statistical models.*

                                                              Non-cancer     Cancer         Predictive model
                                                              patients       patients       (logistic regression)
Indicator                                           P-value†  Mean§  S.E.    Mean§  S.E.    Sensitivity (%)  Specificity (%)  P-value‡

log10(concentrations)
  HMUra                                             .0000     -3.3   .1      -2.8   .1      91               69               .0001
  Fapy-A                                            .0000     .2     .1      -.7    .1      82               93               .0000
log10(ratio of conc.)
  Fapy-A/HMUra                                      .0000     3.6    .1      2.0    .1      91               97               .0001
  Fapy-A/8-OH-Ade                                   .0000     .9     .1      -.3    .1      91               96               .0000
  Fapy-A/8-OH-Gua                                   .0000     1.4    .1      .2     .1      91               94               .0001
  Fapy-A/(8-OH-Ade + 8-OH-Gua)                      .0000     .8     .1      -.4    .1      91               97               .0001
  Fapy-A/(8-OH-Ade + 8-OH-Gua + HMUra)              .0000     .8     .1      -.4    .1      91               97               .0001
  (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua)           .0000     .8     .1      -.4    .1      91               97               .0000
  (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua + HMUra)   .0000     .8     .1      -.4    .1      91               97               .0000

*All analyses are based on n = 15 cancer patients with a total of 22 tissue samples and 15 non-cancer patients with a
total of 70 tissue samples.

†Based on linear regression random-effects model, testing the null hypothesis of equality of means for cancer and
non-cancer (RMT) patients.

§Mean of the log10 values for the tissue specimens.

‡Based on logistic regression random-effects model, testing the null hypothesis that the log10 value is not
associated with cancer vs. non-cancer (RMT) classification of tissue section.

We used the model for the log10 ratio of summed Fapy derivatives to summed OH-adducts
plus HMUra because it is based on reductive vs. oxidative conversion pathways of the
initial 8-oxyl derivative and is one of the best models for predicting the cancer vs. noncancer
origin of tissue. However, as Table 2 clearly shows, there are other models with high
sensitivity and specificity and very small significance levels. The size of this dataset does not
allow definitive selection among the several good models. The predictive equation is:

loge [P/(1 - P)] = 0.76 - 6.34 × log10(ratio)

where P is the probability that a tissue sample derives from a cancer patient and "ratio" refers to
the ratio of the sum of the two Fapy derivatives to the sum of the two OH-adducts plus HMUra.
The standard errors of the constant term and of the multiplier of the log10 ratio in the model
above are 0.58 and 1.53, respectively. Using the model and the cut-off Pc = 0.5, tissue samples
with an estimated probability P > 0.5 were classified as cancer-derived, and those with P < 0.5
were classified as noncancer-derived. The corresponding ratio of concentrations that best divides
cancer from noncancer samples is 1.32. As can be seen in Table 2, the sensitivity (91%) and
specificity (97%) are both very high.

In addition to the classification based on (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua +
HMUra), we also show the classification based on a model of high sensitivity and specificity
using the ratio (Fapy-A/8-OH-Gua). In the latter model, the predictive equation is:

loge [P/(1 - P)] = 3.71 - 5.51 × log10(ratio)

The standard errors of the intercept and of the multiplier of the log10 ratio are 1.20 and 1.38,
respectively. The cut-point for the predictive probability used in classifying a tissue as
cancer-derived is P > 0.4, which corresponds to a ratio of concentrations of 5.6 or less.
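
The two predictive equations above can be evaluated directly, and the concentration ratio corresponding to a given cut-off probability follows by solving each equation for the ratio. The sketch below is illustrative only; the function and constant names are hypothetical, while the coefficients are those quoted in the text.

```python
import math

# (intercept, slope) on the log-odds scale, as quoted in the text.
MODEL_SUMMED = (0.76, -6.34)     # (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua + HMUra)
MODEL_FAPYA_GUA = (3.71, -5.51)  # Fapy-A/8-OH-Gua


def cancer_probability(ratio, model):
    """Predicted probability P from loge[P/(1 - P)] = a + b * log10(ratio)."""
    a, b = model
    log_odds = a + b * math.log10(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


def ratio_at_cutoff(cutoff, model):
    """Concentration ratio at which the predicted probability equals cutoff."""
    a, b = model
    log_odds = math.log(cutoff / (1.0 - cutoff))
    return 10 ** ((log_odds - a) / b)
```

Here ratio_at_cutoff(0.5, MODEL_SUMMED) evaluates to about 1.32 and ratio_at_cutoff(0.4, MODEL_FAPYA_GUA) to about 5.6, in agreement with the cut-points quoted in the text. Because both slopes are negative, ratios below the cut-point yield probabilities above the cut-off.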

The comparison of log10 concentrations and ratios between IDC and MNT showed no
statistical differences, thus indicating that the observed DNA base modifications were pervasive
in both the IDC and MNT. However, due to the small sample size (n = 22 sections), large
differences in concentrations or ratios between IDC and MNT cannot be ruled out.


Based upon the pronounced differences in base lesion profiles and concentrations
between the cancer and noncancer tissue, a graph of predicted probability of the cancerous origin
of a tissue vs. log10 of the concentration ratios was constructed. This demonstrates the strong
ability of this model to discriminate the nature of each tissue. The data are given below in
Figure 3:


[Figure 3: plot. Horizontal axis: Log10 of Concentration Ratio.]

Figure 3. The predicted probability of the cancerous origin of a tissue is plotted against the log10 of
the concentration ratio (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua + HMUra) for all samples
analyzed.

It is known that oxidative stress is linked to cancer formation and that increases in
OH-adducts (e.g., 8-OH-Gua) are a likely consequence of oxidative conditions in the cell. Consistent
with  this  is  the  concept  that  the  oxidative  modifications  of  DNA  structure  reported  in  breast 
cancer  are  the  probable  basis  for  the  carcinogenic  action  of  H2O2  generation.  Although  multiple 
biochemical  processes  may  be  involved,  it  is  suggested  that  the  •OH  may  arise  as  a  consequence 
of  the  formation  of  H2O2  from  redox  cycling  of  endogenous  (e.g.,  hormones)  or  exogenous 
effectors  (e.g.,  polychlorinated  biphenyls  [PCBs]  and  chlorinated  hydrocarbons),  mediated  by 
cytochrome  P-450  and  cytochrome  P-450  reductase  (27). 

It  is  noteworthy  that  breast  tissues  of  women  with  breast  cancer  have  elevated 
concentrations  of  PCBs  compared  to  those  with  benign  breast  disease  (28).  In  this  regard,  the 
previously  reported  (29)  relationship  between  fat  intake  and  HMUra  in  DNA  of  peripheral 
nucleated  blood  cells  of  women  with  breast  cancer  may  reflect,  at  least  in  part,  the  influence  of 
organic  xenobiotics  enriched  in  the  dietary  fat. 

The H2O2, which is readily transported across the nuclear membrane, is likely converted
to the •OH via the Fe²⁺-catalyzed Fenton reaction. The subsequent attack of the •OH on the
nucleotide bases results in the formation of the 8-oxyl derivatives of the purines and the
hydroxylation of thymine to form HMUra. At this point, the conversions of the purines can
either lead to oxidatively-formed OH-adducts that potentially increase cancer risk or to
reductively-formed Fapy derivatives that are putatively non-genotoxic. The synthesis of the
ring-opening structures appears to protect the DNA from potentially mutagenic OH-adduct
formation and, as such, reflects a unique antioxidant role for the DNA base structure. Strikingly,
the nature of these transformations occurring in the cell leading to differing classes of base
lesions is entirely consistent with the redox-coupled pathways of •OH-induced purine
modifications occurring in aqueous solution as described by Steenken (30). It is particularly
noteworthy that oxidants (e.g., O2) in aqueous solution quantitatively suppress Fapy derivatives
and increase the yield of 8-OH-adducts. In view of this, we were not surprised that the most
effective predictive models shown in Table 2 [e.g., (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua +
HMUra)] were completely consistent with the above-mentioned pathways. The proposed
pathway for the synthesis of the OH-adducts and Fapy derivatives is given below in Figure 4:


[Figure 4: reaction scheme.]

Figure 4. A proposed scheme for the formation of the ring-opening (Fapy) derivatives and
8-OH-adducts in the female breast. As an example, adenine is converted to the 8-oxyl derivative
(A8OH•) via the attack of the •OH. The •OH is derived from the Fe²⁺-catalyzed conversion of
H2O2 (pathway 4). The H2O2 may arise from multiple metabolic processes occurring in the
breast epithelial cells, one of which may include the redox cycling of an endogenous or
exogenous effector molecule (E) via cytochrome P-450 oxidase (pathway 2) and cytochrome
P-450 reductase (pathway 1). The A8OH• can be converted oxidatively to 8-OH-Ade (pathway b)
or reductively to Fapy-A (pathway a or c). The redox balance in the breast cells would dictate
the ratio, for example, of 8-OH-Gua:Fapy-A formed, with increases in cellular oxidants favoring
pathway b and potential cancer formation. The cytochrome P-450 pathways are essentially as
described by Deodatta et al. and the aqueous solution redox chemistry and transformation
reactions are based on those described by Steenken.

We conclude that the •OH-induced oxidative base damage likely represents an event of
considerable importance in the early development of breast cancer. For example, the DNA from
several sections of the normal breast contained greater than one 8-OH-Gua base lesion in 1,000
normal bases. The presence of elevated levels of 8-OH-Gua in the DNA of a relatively small
number of normal breast sections is perhaps to be anticipated considering the fact that one out of
eight women develops breast cancer on a lifetime basis. In this context, the attack of the
•OH  on  the  base  structure  of  the  breast  DNA  would  be  expected  to  result  in  the  activation  or 
augmentation  of  nuclear  oncogenes  and  the  deregulation  of  tumor  suppressor  genes,  such  as  p53 
(31).  Other  genotoxic  changes  are  likely  and  the  greater  the  intensity  of  the  radical  attack,  the 
greater  the  expectation  of  mutagenic  events  occurring. 

In considering the proposed role played by cellular redox conditions and base lesion
formation in the etiology of breast cancer, it was recognized that DNA repair may potentially play
a significant part in processes that govern these circumstances. Enzymes capable of repairing
Fapy and 8-hydroxypurine derivatives are known to be constitutively expressed in E. coli and
mammals (32). Moreover, growing evidence indicates that one of these enzymes, the FPG protein,
is involved in the repair of both Fapy and 8-hydroxy base lesions (33). Although the
8-hydroxy-dG derivative may result in some inhibition of DNA replication, more specifically it is
known to be mutagenic, resulting in miscoding lesions due to a 1-to-2% level of misrepair (34).
However, there is no current evidence supporting a mutagenic property for the ring-opening
lesions. Instead, the Fapy residues have been shown to block DNA synthesis (17). Thus,
unrepaired Fapy residues, which are abundant in the DNA from the normal breast, would not be
expected to be genotoxic, although they may be cytotoxic. For differential DNA repair to explain
the present findings, the transition from high ratios of Fapy:hydroxy derivatives in the RMT to
low ratios in the IDC and MNT would be expected, for example, to involve preferential repair of
the Fapy-A residue while the 8-hydroxy derivatives increased. This circumstance does not
conform to the known behavior of the DNA repair mechanisms involved. In fact, support for our
hypothesis for oxidation-driven base lesion changes during oncogenesis in breast cancer includes
evidence for decreased DNA repair in cancers of the breast, colon and lung (35), the presently
demonstrated increased concentrations of OH-adducts in the cancerous breast, and the finding
that trans-tamoxifen exerts an antioxidant effect (i.e., a decrease in tumor promoter-induced H2O2
formation in human neutrophils) that correlates with diminished concentrations of
oxidatively-formed HMUra. Of additional significance is the fact that patients with a single
breast cancer are at increased risk of having a second primary tumor in the breast (36). Our
finding that log10 base concentrations and ratios were not statistically different between IDC
tissue and MNT is consistent with this observation. That is, significant oxidative base damage in
the DNA would be expected to still be present in the MNT after the tumor is removed, thus
potentially increasing the risk of a second tumor occurring.

Regarding the statistical models, the sensitivity and specificity calculated from our
specific dataset (Table 2) can be expected to be somewhat high (non-conservative) compared to
the specificity and sensitivity that would be calculated from a trial of the predictive equation with
a new population of tissue samples. This bias occurs because the sensitivity and specificity have
been optimized within this specific dataset. The statistical significance calculations, however,
are unbiased for inference about a similar mix of RMT and cancer patients as observed here.
Thus, the promise of this area is very strong (based on significance levels), but specific screening
models should be based on a larger dataset with more cancer patients and normal individuals.


The sensitivity and specificity in this study have been calculated for classification of
tissue samples and not for classification of individual patients. However, multiple tissue samples
from patients are very likely to be classified consistently. For example, classification of tissue
sections based on log10[(Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua + HMUra)] and
log10(Fapy-A/8-OH-Gua) showed only 4/92 and 6/92 incorrect classifications of cancer vs.
noncancer tissue, respectively (Table 3). Thus, the method is most promising for use in
classification of patients based on individual samples of tissue.

Table 3. Classification of tissue sections using predictive model based on ratio of
concentrations: (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua + HMUra) or (Fapy-A)/(8-OH-Gua).*

Patient #,          Classification of sections based on     Classification of sections
type of patient     (Fapy-A + Fapy-G)/(8-OH-Ade +           based on (Fapy-A)/(8-OH-Gua)
                    8-OH-Gua + HMUra)
                    N incorrect/N total                     N incorrect/N total

1, RMT              0/7                                     0/7
2, RMT              0/13                                    2/13
3, RMT              0/4                                     1/4
4, RMT              0/5                                     1/5
5, RMT              1/5                                     0/5
6, RMT              0/5                                     0/5
7, RMT              0/5                                     0/5
8, RMT              0/6                                     0/6
9, RMT              0/5                                     0/5
10, RMT             0/2                                     0/2
11, RMT             0/2                                     0/2
12, RMT             0/2                                     0/2
13, RMT             0/2                                     0/2
14, RMT             1/5                                     0/5
15, RMT             0/2                                     0/2
16, cancer          0/1                                     0/1
17, cancer          0/1                                     0/1
18, cancer          0/1                                     0/1
19, cancer          0/1                                     0/1
20, cancer          0/1                                     0/1
21, cancer          0/1                                     0/1
22, cancer          0/1                                     0/1
23, cancer          0/1                                     0/1
24, cancer          0/2**                                   0/2**
25, cancer          0/2**                                   0/2**
26, cancer          0/2**                                   0/2**
27, cancer          2/2**                                   2/2**
28, cancer          0/2**                                   0/2**
29, cancer          0/2**                                   0/2**
30, cancer          0/2**                                   0/2**
Total               4/92                                    6/92

*Tissue was classified as derived from a cancer patient if ratio of concentrations (Fapy-A + Fapy-G)/
(8-OH-Ade + 8-OH-Gua + HMUra) < 1.32 (second column) or (Fapy-A)/(8-OH-Gua) < 5.6 (third column).

**Represents paired IDC and MNT from the same patient.
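
The section-level sensitivity and specificity quoted in Table 2 can be recovered from the per-patient counts in Table 3. The sketch below simply tallies those counts; the variable and function names are hypothetical, and the tuples are transcribed from Table 3 as (N incorrect, N total).

```python
# (n_incorrect, n_total) per patient, transcribed from Table 3.
# "SUMMED" is the (Fapy-A + Fapy-G)/(8-OH-Ade + 8-OH-Gua + HMUra) model.
RMT_SUMMED = [(0, 7), (0, 13), (0, 4), (0, 5), (1, 5), (0, 5), (0, 5), (0, 6),
              (0, 5), (0, 2), (0, 2), (0, 2), (0, 2), (1, 5), (0, 2)]
RMT_FAPYA_GUA = [(0, 7), (2, 13), (1, 4), (1, 5), (0, 5), (0, 5), (0, 5), (0, 6),
                 (0, 5), (0, 2), (0, 2), (0, 2), (0, 2), (0, 5), (0, 2)]
# Cancer-patient counts are identical for both models in Table 3.
CANCER_BOTH = [(0, 1)] * 8 + [(0, 2), (0, 2), (0, 2), (2, 2), (0, 2), (0, 2), (0, 2)]


def fraction_correct(rows):
    """Fraction of tissue sections classified correctly across all rows."""
    incorrect = sum(i for i, _ in rows)
    total = sum(n for _, n in rows)
    return (total - incorrect) / total


sensitivity = fraction_correct(CANCER_BOTH)            # 20 of 22 cancer sections
specificity_summed = fraction_correct(RMT_SUMMED)      # 68 of 70 RMT sections
specificity_fapya_gua = fraction_correct(RMT_FAPYA_GUA)  # 66 of 70 RMT sections
accuracy_summed = fraction_correct(RMT_SUMMED + CANCER_BOTH)  # 88 of 92 sections
```

Rounded to whole percentages these give 91% sensitivity for both models, 97% and 94% specificity, and 96% overall correct classification for the summed-ratio model, matching the values reported in the text. Both misclassified cancer sections come from a single patient (patient 27).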

The  models  considered  were  based  on  retrospective  analysis  of  the  origin  of  the  tissue. 
The curve displaying the probability of cancer vs. the log10 of base lesion concentration ratios
clearly affirms the difference in the results between the cancer and noncancer patients (see
Figure 3). Given the biological implications of the differing classes of base lesions, which are
formed  as  a  function  of  the  cellular  redox  potential  as  discussed  above,  it  is  reasonable  to 
conclude  that  this  may  represent  a  basis  for  prospective  cancer  risk  to  be  estimated  through  log 
transformations  of  base  lesion  concentrations  in  the  DNA  of  breast  tissues.  In  this  regard,  it  is 
noteworthy  that  the  probability  model  classifies  certain  of  the  tissues  examined  as  having  base 
lesion  concentrations  that  may  reflect  transitional  states  between  those  of  normal  and  cancerous 
tissue.  Evaluation  of  the  potential  risk  of  an  individual  developing  breast  cancer  at  an  early  stage 
represents  an  important  potential  of  this  analysis. 

The model that predicts cancer vs. noncancer status may thus also predict future risk.
Evaluation of the present methodology for clinical application would require a prospective
study  of  women  at  variable  predicted  risks  for  developing  breast  cancer,  based  on  models  such  as 
we  have  developed  here.  Such  a  study  would  naturally  include  an  evaluation  of  relationships 
between  diet,  ethnic  differences,  reproductive  history,  familial  history,  and  other  relevant  factors. 
If  prospective  studies  confirm  our  results,  individuals  identified  to  have  a  heightened  predicted 
cancer  risk  would  be  expected  to  benefit  from  close  monitoring  and  possible  intervention  with 
antioxidants  or  other  agents.  The  close  association  between  cancer  chemoprotection  and 
compounds  with  antioxidant  activity  (37-40)  is  consistent  with  this  potential  and  the  results 
presented  in  this  paper. 

In  conclusion,  it  is  clear  that  the  DNA  base  lesion  profiles  reflect  intrinsic  differences  that 
exist  between  normal  and  cancer-derived  tissues  in  a  manner  regulated  by  the  redox  condition  of 
the  breast  cells.  The  nature  of  these  base  lesions  present  in  the  tissues  represents  a  useful  sentinel 
for  evaluating  the  prevailing  redox  conditions.  Further,  potential  mutagenic  damage  to  the  DNA 
base  structure  can  be  assessed.  On  this  basis,  it  is  a  logical  assumption  that  a  shift  in  the  base 
profiles  characteristic  of  normal  breast  tissue  to  profiles  characteristic  of  cancer  tissue  is  early 
evidence  for  a  heightened  risk  of  cancer  formation.  In  this  context,  the  results  presented  describe 
a  potentially  powerful  method  for  defining  characteristic  changes  in  the  DNA  of  female  breast 
tissue during oncogenesis. Given the fact that analyses can be performed readily on small
amounts  of  biopsied  tissue,  this  method  could  ultimately  have  wide  application  for  determining 
individuals  at  risk  in  the  population. 

3. Application of FT-IR Techniques to the Analysis of Human Normal and Cancerous Female
Breast Tissues

Alterations in the FT-IR spectra between normal and cancerous female breast tissues are
significant in a number of areas of the spectrum investigated. To illustrate, the mean DNA
spectrum (2000-700 cm⁻¹) for patients with and without cancer is shown in Fig. 5. The main
features of the spectral profiles resemble those of DNA from normal tissues obtained previously
in studies using IR spectroscopy (19, 41). For example, the area 1700 to 1500 cm⁻¹ was assigned
to strong CO stretching and NH2 bending vibrations, and 1550-1300 cm⁻¹ was assigned to weak
NH vibrations and CH in-plane base deformations; 1240 cm⁻¹ represents medium PO2⁻
antisymmetric-stretching vibrations of the phosphodiester backbone; 1100 to 900 cm⁻¹ represents
strong PO2⁻ symmetric stretching vibrations (~1080 cm⁻¹), the CO-stretching vibrations of the
deoxyribose moiety and the PO-stretching vibrations of the PO group of the phosphodiester
backbone. The lower portion of Fig. 5 illustrates the P values associated with tests for
differences between cancer and noncancer spectra at each wavenumber. A large P value suggests
that there was no difference between cancer and noncancer spectra, whereas a small P value
indicates that the difference was significant. The unequal variance version of the t-test was used
because patients without cancer had more diverse spectra than did those with cancer. As shown
in Fig. 5, a number of frequency areas have P values less than 0.05. Among the 743 P values
from 1503 to 761 cm⁻¹, the Schweder and Spjøtvoll (28) method suggested that the null
hypothesis was likely to be false at approximately 300 frequencies. That is, the difference
between cancer and noncancer spectra was real at approximately 300 frequencies. The
differences between cancer and noncancer spectra were not uniformly distributed across the
spectrum. The P values on the right side of Fig. 5 are generally smaller than those on the left
side and most small P values occurred at 1200 cm⁻¹ or lower. The small P values identified areas
in which cancer and noncancer spectra were notably different. These include approximately the
1720-, 1170-, 1070-, 1030-, 940-, 860- and 790-cm⁻¹ areas.


[Figure 5: mean spectra (top) and P values (bottom). Horizontal axis: Wavenumber (cm⁻¹).]

Figure 5. Mean normalized absorbance spectra of patients with cancer (n=18) and without
cancer (n=29) and the statistical significance of cancer versus noncancer absorbances, based on
the t-test with unequal variances. P values were not adjusted for multiple testing (mean spectra,
top; P values, bottom).


A global test that is not influenced by the problem of multiple P values, based on distance-C
(the mean Pearson correlation with the cancer library), also was used to assess the difference
between cancer and noncancer spectra. For the primary range of interest, 1503-761 cm⁻¹, mean
distance-C for patients with and without cancer was 0.93 and 0.87, respectively. Using the
unpaired t-test with unequal variances, P = 0.003 for the cancer/noncancer difference. In
addition, a permutation test yielded P = 0.004 for this difference. These results are shown in
Table 4, together with tests for other spectral features including distance-A and spectral
intensities and locations for the four distinctive peaks between 2000 and 700 cm⁻¹ (peaks A, B,
C, and D, reading from highest to lowest frequency).
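
The two quantities in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: `distance_c` takes the mean Pearson correlation of a spectrum with every spectrum in a reference library, and `permutation_p` is a generic two-sided permutation test on a difference in group means. All names are hypothetical, and the handling of a cancer spectrum that is itself a member of the library is not specified in the text.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_c(spectrum, library):
    """Mean Pearson correlation of one spectrum with each library spectrum."""
    return mean([pearson(spectrum, ref) for ref in library])


def permutation_p(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation P value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one so P is never exactly zero
```

In use, `distance_c` would be evaluated for every patient and the resulting cancer and noncancer values passed to `permutation_p`. A value near 1.0 indicates a spectrum close to the library, which is why lower distance-C values correspond to greater distance.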

It  is  notable  that  the  noncancer  spectra  were  more  diverse  than  the  cancer  spectra  and 
generally  more  dissimilar.  Among  the  patients  without  cancer,  12  of  29  (41%)  had  distance-C 
lower  than  0.9  (lower  values  indicate  greater  distance)  whereas  among  the  patients  with  cancer, 
only 2 of 18 (11%) had distance-C lower than 0.9.

Table 4. Comparison of spectral descriptive characteristics for cancer (n=18) and noncancer
(n=29) patients.

Item                                        Cancer          Noncancer       P-value
                                            mean (±SD)      mean (±SD)

Distance C: mean Pearson correlation
  with cancer library for 1503-761 cm⁻¹     0.93 (±0.03)    0.87 (±0.10)    0.003*†
Distance A: airline distance from
  cancer library for 1503-761 cm⁻¹          5.4 (±0.8)      7.7 (±3.6)      0.01†

Peak intensities and wavenumber location
Peak A in 1652 cm⁻¹ area
  Normalized absorbance                     2.4 (±0.5)      2.3 (±0.5)      0.6
  Location (cm⁻¹)                           1652 (±6)       1652 (±7)       0.9
Peak B in 1410 cm⁻¹ area
  Normalized absorbance                     1.1 (±0.2)      1.2 (±0.3)      0.4
  Location (cm⁻¹)                           1412 (±6)       1407 (±10)      0.02†
Peak C in 1233 cm⁻¹ area
  Normalized absorbance                     1.3 (±0.2)      1.2 (±0.3)      0.09
  Location (cm⁻¹)                           1232 (±7)       1235 (±7)       0.1
Peak D in 1061 cm⁻¹ area
  Normalized absorbance                     2.4 (±0.2)      2.3 (±0.3)      0.1†
  Location (cm⁻¹)                           1060 (±6)       1062 (±13)      0.4†

Log10 (Fapy-A/8-OH-Ade)                     -0.31 (±0.35)   0.39 (±0.61)    0.001†
                                            (n=10)          (n=19)

SD: standard deviation
* P-value based on permutation test is 0.004.
† t-test based on unequal variances.


a.  Systematic  shifts  in  the  spectra  toward  a  cancer-like  phenotype 

The possibility was investigated that those noncancer spectra that are distant from the
cancer library differ from the cancer spectra in a systematic rather than a random way. Larger
distances (smaller distance-C values between a given spectrum and the cancer library) can
occur in one of two ways. First, the difference may be random, in that at a given frequency the
distant spectra sometimes lie above and sometimes lie below the mean cancer library
spectrum. Alternatively, at a given frequency the spectra with greater distances generally may
lie below (or above) the mean cancer DNA spectrum. To investigate the possibility of a
systematic shift in the spectrum with distance, the Pearson correlation of normalized
absorbance with distance from the cancer library (expressed as distance-C) was calculated for
each frequency. The results in Fig. 6 show some strikingly large correlations. If the distance
from the library were not associated with increasingly positive or negative deviations around
the mean cancer profile, then most of the correlations in Fig. 6 would lie close to zero and
between the upper and lower (P = 0.05) horizontal lines. Instead there were wide fluctuations in
the correlations, with a number lying even beyond r = ±0.6, the horizontal line for P = 0.001.
Using the Schweder and Spjøtvoll (24) method, approximately 500 of the 743 frequencies of
1503-761 cm⁻¹ were estimated to violate the null hypothesis of no association between
absorbance and distance from the cancer library. The most significant correlation occurred at
1172 cm⁻¹, at which r = -0.85 and the unadjusted P = 5 × 10⁻⁹. Even multiplying this P value by
743 for a highly conservative Bonferroni correction for multiple testing (42) still yields a P value
of 4 × 10⁻⁶. Thus, the null hypothesis of no association between distance from the cancer library
and spectral absorbance can be rejected decisively. Again, it is important to remember that
spectra necessarily must converge on the cancer profile as the distance becomes smaller
(distance-C approaches 1.0). However, if the process is random (the null hypothesis), there is no
reason why spectra at a given frequency should approach the mean cancer library from one side
rather than from randomly above or below it. Fig. 7 shows the consistent change of the
noncancer spectrum for two frequencies. These are both in the area 1250-1000 cm⁻¹, in which the
cancer/noncancer differences are most striking, based on the t-test.
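
The multiple-testing arithmetic in this paragraph can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical and the listed P values are invented rather than taken from Fig. 6. `t_from_r` uses the standard conversion of a Pearson correlation to a t statistic with n - 2 degrees of freedom, and `bonferroni` multiplies each P value by the number of tests.

```python
import math


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def count_significant(p_values, alpha=0.05):
    """Number of tests still below alpha after Bonferroni adjustment."""
    return sum(p < alpha for p in bonferroni(p_values))


# Hypothetical unadjusted P values at four wavenumbers (illustration only)
p_values = [1e-6, 0.004, 0.03, 0.6]
adjusted = bonferroni(p_values)
```

The Bonferroni adjustment is conservative because it treats the tests at adjacent wavenumbers as if they were independent, which neighboring points of a spectrum generally are not.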

Figure 6. Pearson correlation coefficient between normalized absorbance and distance-C from
cancer library. Horizontal lines indicate unadjusted P values (P = 0.05, 0.01, and 0.001, above
and below zero). The smallest P value occurs at 1172 cm⁻¹, at which the unadjusted
P = 5 × 10⁻⁹.

To determine if consistent changes were also occurring within the cancer group, we used
analyses similar to those presented earlier that showed consistent changes in the noncancer
spectra relative to distance from the cancer library. In this case, the noncancer spectra furthest
from the cancer library served as a reference library. The analyses show that as distance between
the cancers and this group of noncancers increases, some of the frequency areas show a trend of
increasing or decreasing normalized absorbance, rather than random variation.

Figure 7. The relationship between the distance of patients without cancer from the cancer
library (distance-C) and normalized absorbances at frequencies 1172 and 1234 cm⁻¹, with one
panel per wavenumber plotted against distance-C. Patients who were closer to the cancer
library had an increasingly higher absorbance at 1234 cm⁻¹ and an increasingly lower
absorbance at 1172 cm⁻¹.

b.  Relationship  of  spectral  features  to  base  modifications 

Some of our spectral descriptive measures were related to the Fapy-A/8-OH-Ade base
model (9). The logarithm of the ratio of base concentrations was correlated significantly with
several of the spectral descriptive measures, as shown in Table 5. This analysis has reduced
statistical power compared to analyses based solely on spectral data because of the smaller
number of patients with GC-MS/SIM data (n = 10 cancer; n = 19 noncancer). Still, several of the
correlations were statistically significant or had magnitudes of 0.3 or larger. These correlations
of independently derived spectral measures with a GC-MS-based measure of cancer risk
support the finding that both approaches are able to discriminate between patients with and
without cancer and to establish potential cancer risk levels.

Table 5. Pearson correlation of spectral descriptive characteristics with log10 ratio of
concentrations of Fapy-A to 8-OH-Ade (n=10 cancer and n=19 noncancer patients).

Item                                        Correlation       P-value

Distance C: Pearson correlation with cancer
  library for 1503-761 cm⁻¹                 -0.53             0.003
Distance A: airline distance from cancer
  library for 1503-761 cm⁻¹                 0.39              0.04

Peak intensities and wavenumber location
Peak A in 1652 cm⁻¹ area
  Normalized absorbance                     0.22              0.3
  Location (cm⁻¹)                           -0.22             0.3
Peak B in 1410 cm⁻¹ area
  Normalized absorbance                     0.33              0.08
  Location (cm⁻¹)                           -0.59 (-0.48)*    0.0008 (0.008)*
Peak C in 1233 cm⁻¹ area
  Normalized absorbance                     -0.45             0.01
  Location (cm⁻¹)                           0.21              0.3
Peak D in 1061 cm⁻¹ area
  Normalized absorbance                     -0.48             0.008
  Location (cm⁻¹)                           0.57 (0.39)*      0.001 (0.03)*

* The Spearman correlation coefficient and its P-value are shown in parentheses when an outlier
may have influenced the Pearson correlation coefficient.


c.  Breast  cancer  risk  model 

The spectral descriptive measures were combined into a model for cancer risk
assessment. Table 6 shows the model, based on logistic regression, along with the observed
sensitivity and specificity for this particular data set using the first six factors derived earlier.
The association between the factors and cancer/noncancer status is significant (P = 0.001); the
sensitivity and specificity are both 83%. These sensitivity and specificity values, however, are
not unbiased because the cut-point value of predicted probability used to classify patients as
having or not having cancer (probability > 0.4) was based on inspection of the predicted
probabilities. Nevertheless, the statistical significance of the association of the six factors with
cancer/noncancer status (P = 0.001) was unbiased. The predicted probability of cancer as a
function of this model, which produces a risk score, is shown in Fig. 8. The plot shows that there
was some overlap of patients with and without cancer in the middle range of predicted
probabilities, whereas at the lowest and highest levels of predicted probability patients with and
without cancer are separated clearly.
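
The classification step described above can be sketched as follows. This is a minimal illustration with hypothetical function names; it does not refit the model of Table 6. A linear predictor from the logistic regression is converted to a probability, patients are assigned to the cancer group when that probability exceeds the 0.4 cut-point, and sensitivity and specificity follow from the counts of correct classifications reported in the text.

```python
import math


def predicted_probability(logit):
    """Logistic function: convert a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def classify(probabilities, cut_point=0.4):
    """True (cancer group) when predicted probability exceeds the cut-point."""
    return [p > cut_point for p in probabilities]


def sensitivity_specificity(predicted, actual):
    """Fractions of actual cancers and actual noncancers classified correctly."""
    true_pos = sum(p and a for p, a in zip(predicted, actual))
    true_neg = sum((not p) and (not a) for p, a in zip(predicted, actual))
    positives = sum(actual)
    negatives = len(actual) - positives
    return true_pos / positives, true_neg / negatives


# Counts reported in the text: 15 of 18 cancer and 24 of 29 noncancer correct
sensitivity = 15 / 18
specificity = 24 / 29
```

Both ratios round to 83%. Because the cut-point was chosen after inspecting the predicted probabilities, these values would be expected to be somewhat optimistic for new patients.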

Table 6. Predictive models for cancer vs. noncancer status based on a logistic regression
analysis of first six factors from factor analysis.

Principal Components    Model Coefficient   S.E.      P-value

Factor 1                19.0                8.8       0.03
Factor 2                0.01                0.24      1.0
Factor 3                6.8                 3.0       0.02
Factor 4                3.6                 1.7       0.04
Factor 5                -3.9                1.7       0.02
Factor 6                -1.5                0.6       0.02
Constant                568.3               263.5     0.03
Entire model                                          0.001*

* P-value for null hypothesis that the six principal components do not improve
prediction of cancer/noncancer status.
† Sensitivity = 83% (15/18 cancer patients correctly classified); specificity = 83%
(24/29 noncancer patients correctly classified). Classification is based on the logistic
regression model, with patients assigned to the cancer group when the predicted
probability p > 0.4.

The power of the predictive model was demonstrated by an independent test. The MNT
near the tumor was analyzed to yield spectra in the same manner described earlier. The factor
scores for these patients were calculated and used in the predictive model described above. Eight
of the 10 patients analyzed were classified as having cancer (predicted probability > 0.4) based
on the model. As indicated in Fig. 8, most of the MNTs were well above the 0.4 cut-point for
predicted probability of cancer (one of the MNT specimens was from a patient who also provided
a cancer specimen used to develop the predictive model; both specimens were classified as
cancer by the model). In a practical setting, such models would have to be developed using a
larger sample size and then tested on a new panel of patients whose specimens were not used for
development of the model.

Figure 8. Predicted probability of cancer status using a logistic regression model based on the
first six factors from factor analysis.

d.  Grouping of patients without cancer

To achieve a spread of groups, the DNA from the 29 reduction mammoplasty patients
was grouped based on distance-C (Fig. 9), while blinded to the specific spectral profiles. The
three groups were defined by cut-points in distance-C (Group 1 = 0.60-0.72; Group 2 =
0.77-0.89; and Group 3 = 0.90-0.96). The mean values of the log10 ratio of concentrations of
Fapy-A/8-OH-Ade were 0.43, 0.73, and 0.087, respectively, which clearly reflects a relative
increase in the OH adduct in Group 3. The mean value for the cancer group was -0.31. Thus,
changes in the log10(Fapy-A/8-OH-Ade) ratio among the reduction mammoplasty patients
closely reflected changes seen in the FT-IR spectral differences.
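
The grouping rule can be written out directly from the cut-points given above. The sketch below is illustrative only; the function name is hypothetical, and returning `None` for a value that falls in a gap between the reported ranges (for example 0.75) is an assumption, since the text reports only the observed ranges.

```python
def assign_group(distance_c):
    """Return the group number (1-3) for a distance-C value, or None."""
    groups = {1: (0.60, 0.72), 2: (0.77, 0.89), 3: (0.90, 0.96)}
    for group, (low, high) in groups.items():
        if low <= distance_c <= high:
            return group
    return None  # value falls in a gap or outside the reported ranges
```

Group 3 contains the noncancer spectra most similar to the cancer library, because higher distance-C values indicate smaller distance.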


Figure 9. Groups defined by distance from cancer library: far (Group 1, distance-C = 0.60-0.72),
intermediate (Group 2, distance-C = 0.77-0.89), and close (Group 3, distance-C = 0.90-0.96),
based on distance-C.

Major changes toward a cancer phenotype were evident when comparing Groups 1, 2 and
3 relative to the cancer profile (Fig. 10). Group 3, which most closely matched the cancer DNA
profile, was designated as the cancer-like phenotype and comprised 59% of the total patient
database without cancer.


Figure 10. Mean spectra for patients with cancer and for groups of patients without cancer at
three distances from the cancer library: far (Group 1), intermediate (Group 2), and close
(Group 3), based on distance-C. The mean cancer spectrum (dashed line) is superimposed on
the spectrum of each group.

Beginning with Group 1, progressive absorbance increases were evident in the area
1700-1500 cm⁻¹, which is associated with CO stretching and NH2 bending vibrations in DNA
(19, 41). The area between 1550 and 1300 cm⁻¹, which is associated with NH vibrations and CH
in-plane base deformations, showed absorbance increases at ~1450 cm⁻¹. These spectral
alterations likely were associated with the •OH-induced base modifications previously
demonstrated in the breast by GC-MS/SIM (9). The band at ~1230 cm⁻¹, which is assigned to
the PO2⁻ antisymmetric stretching vibrations of the phosphodiester backbone (19),
progressively developed from a relatively small peak in Group 1 to a relatively large peak in
Group 3. As the ~1230 cm⁻¹ peak progressively increased, the shoulder in Group 1 at
~1150 cm⁻¹ became less prominent and showed a slight shift to a lower wavenumber
(~1110 cm⁻¹) in Group 3. As shown in Fig. 10, the area 1200-1100 cm⁻¹ progressively changed
between groups until a close match was obtained between the Group 3 spectrum and the cancer
spectrum. The area 1200-1100 cm⁻¹ was found to have the most statistically significant
difference between the cancer and noncancer spectra and in the systematic shift of spectra with
distance from the cancer library, as shown in the P values of Fig. 5 and the correlation values of
Fig. 6. The ~1100-700 cm⁻¹ region exhibited several band shifts and absorbance changes,
notably near 875 cm⁻¹, which were assigned to PO2⁻ and PO stretching vibrations of the
phosphodiester group and CO stretching vibrations of the deoxyribose moiety (19). The
875 cm⁻¹ area underwent a substantial absorbance decrease from Group 1 to Group 3. An
absorbance decrease at ~975 cm⁻¹ was also apparent as the spectral profiles progressed toward
the cancer phenotype.

e.  Results of factor analysis

Standard factor analysis methods were used to determine if the variation across spectra
was related to cancer versus noncancer DNA status (Fig. 11). In factor analysis, which has been
used previously in IR spectroscopy (43, 44), the variation among spectra (expressed as sums of
squares) is separated into a number of factors, each of which explains some of the variation
across the collection of spectra. In the present analysis, the first factor approximately represented
the mean spectrum, and other factors represented variations near this approximate mean in
decreasing order of importance. Before performing the factor analysis, the decision was made to
keep sufficient factors to explain at least 90% of the variation in spectra beyond the variation
explained by Factor 1. In these data, the first factor explained 97.1% of the sums of squares of
normalized absorbances from 1503 to 761 cm⁻¹. Of the remaining sums of squares, Factors 2-6
explained, respectively, 38.3%, 30.8%, 13.4%, 5.5% and 3.2%, for a cumulative 91.2%.
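
The retention rule stated above (keep enough factors to explain at least 90% of the variation remaining after Factor 1) can be checked with a short calculation. The sketch below uses the percentages quoted in the text; the function name is hypothetical.

```python
from itertools import accumulate


def factors_needed(explained_pct, threshold=90.0):
    """Smallest number of leading factors whose cumulative share reaches threshold."""
    for count, total in enumerate(accumulate(explained_pct), start=1):
        if total >= threshold:
            return count
    return None  # threshold not reached with the factors supplied


# Shares of the variation remaining after Factor 1, for Factors 2-6 (from the text)
remaining_share = [38.3, 30.8, 13.4, 5.5, 3.2]
needed = factors_needed(remaining_share)  # factors required beyond Factor 1
cumulative = list(accumulate(remaining_share))
```

Factors 2-5 together account for 88.0% of the remaining variation, so the fifth additional factor (Factor 6) is required to pass the 90% threshold. This is consistent with the use of the first six factors in the risk model.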

Figure 11. Relationship of cancer and noncancer groups based on factor analysis in the
1503-761 cm⁻¹ area. Relative absorbances of the noncancer Groups 1, 2, and 3, defined by
distance from the cancer library in a separate analysis, overlap little. Group 3 and the cancer
group overlap substantially. This plot shows that groups that are coherent relative to distance
also are coherent relative to independently derived factors.

Table 7 shows that Factors 1, 2, 3 and 6 differed significantly between patients with and
without cancer. The lack of a significant difference between patients with and without cancer
relative to Factors 4 and 5 indicates that some of the variation across spectra was not related to
cancer/noncancer status. The factors were calculated to explain any variation and, in essence,
were blinded to cancer/noncancer status.


Table 7. Comparison of factor scores between patients with cancer (n=18) and without cancer
(n=29) (first six factors).

Principal Components    Cancer Mean (±SD)   Noncancer Mean (±SD)    P-value

Factor 1                -30.2 (±0.7)        -29.6 (±1.2)            0.05
Factor 2                -1.0 (±1.8)         0.6 (±3.8)              0.05*
Factor 3                1.1 (±2.2)          -0.8 (±3.1)             0.02
Factor 4                -0.1 (±2.0)         0.1 (±1.9)              0.7
Factor 5                -0.2 (±1.2)         0.1 (±1.2)              0.3
Factor 6                -0.3 (±0.7)         -0.2 (±1.0)             0.05*

SD: Standard deviation.
* t-test based on unequal variances.

The cancer and noncancer groups defined earlier were also coherent with respect to the
factors, which were derived without regard to cancer or noncancer status or distance from the
cancer library. These factors were derived simply to explain variations among spectra. Fig. 11
shows the groupings in a plot of Factor 2 versus Factor 3, the two most important factors
describing variation after Factor 1, which mainly described a mean common to all spectra. There
was little overlap among Group 1 (most distant from the cancer library in Fig. 10), Group 2
(intermediate distance) and the combination of Group 3 and the cancer group. Group 3
substantially overlapped the cancer library. Although there were evidently continuous changes
defining a progression from noncancer to cancer spectra, two diverse statistical analyses
(distance-C and factor analysis) grouped the patients in a similar fashion.

III.  CONCLUSIONS

The  results  describe  the  ability  of  GC-MS/SIM  and  FT-IR  spectroscopy  to  discriminate 
high  vs.  low  damage  in  fish  liver  DNA  and  cancer  versus  noncancer  status  based  on  differences 
in  intrinsic  chemical  properties  of  DNA  isolated  from  female  breast  tissues.  In  the  case  of  the 
breast  tissues,  the  region  of  the  IR  spectrum  associated  with  vibrational  transitions  of  structural 
substituents  of  nucleotide  bases,  deoxyribose  and  phosphodiester  moieties  defines  the  most 
strikingly  consistent,  nonrandom  differences  between  the  cancer  and  noncancer  groups.  Thus, 
the  nonrandom  variation  found  most  likely  was  due  to  inherent  chemical  properties  that  are 
characteristic  of  each  group. 




DAMD17-92-2006  Final  Report 


The  structural  composition  and  integrity  of  DNA  in  living  cells  and  tissues  is  a  matter  of 
prime  importance  for  the  fidelity  of  cell  division  and  the  avoidance  of  mutagenesis. 

Consequently,  it  is  kept  under  close  scrutiny  by  repair  enzymes  (e.g.,  endonucleases)  to  eliminate 
DNA  lesions  that  may  result  in  such  changes.  In  this  regard,  structural  analysis  of  DNA  from  a 
variety  of  living  systems  defines  several  specific  types  of  oxidative  chemical  aberrations  and 
their  frequency  of  occurrence  (6-10;  14).  Generally,  two  types  of  oxidative  modifications  are 
known  to  occur  in  DNA,  those  derived  from  two-electron  oxidations  resulting  in  the  formation  of 
generally  bulky  adducts  of  the  base  structures  (1-3),  and  those  derived  from  one-electron 
oxidations  arising  from  free  radical  processes  resulting  in  a  distinct  variety  of  products,  including 
modifications  of  base  structures  and  the  deoxyribose-phosphodiester  backbone  (4).  The  nature 
and  frequency  of  adducts  derived  from  two-electron  reactions  of  DNA  bases  have  been  the 
subject  of  intense  research  for  many  years  (1-3).  Such  adducts  generally  occur  in  the  range  of  1 
in  10  to  1  in  10  normal  bases  in  a  variety  of  normal  and  neoplastic  tissues  (45).  Conversely,  the 
products  from  one-electron  oxidations,  such  as  the  ring-opening  structure  Fapy-A,  or 
hydroxylation  products,  such  as  8-OH-Ade  or  8-OH-Gua,  generally  are  much  more  abundant.  In 
the  normal  breast,  for  example,  the  ratio  of  ring-opening  structures  to  normal  bases  frequently 
represents  1:1x10  to  1:1x10  normal  bases— a  difference  of  4  to  7  orders  of  magnitude  relative 
to  the  reported  values  for  two-electron  oxidation  products  (3,  45).  Thus,  the  IR  spectral 
differences  observed  in  the  present  study  most  probably  were  controlled  substantially  by  the 
contribution  of  one-electron  oxidative  changes,  such  as  those  shown  previously  to  occur  in  the 
•OH modification of breast DNA (9). The •OH probably arises from H2O2 via the
Fe²⁺-catalyzed Fenton reaction (4, 5). As previously postulated (9), the H2O2 may be formed
from the redox cycling of an effector molecule, such as an estrogen derivative or xenobiotic
chemicals (e.g., organochlorines) arising from environmental exposure. In this context, the
degree of damage may be related, at least partly, to the extent to which normal radical trapping
systems in the cell (e.g., reduced glutathione) are overwhelmed by the generation of the •OH
(7, 46).

The spectral profiles obtained in the present study appear to reflect considerable
differences in the modification of the nucleotide bases, phosphodiester groups and the
deoxyribose moiety between cancer and noncancer groups relative to the attack of the •OH on
DNA. However, other reactions cannot be ruled out. Difficulties presently exist in precisely
defining the proportions of damage inflicted on any of these structural components and in
identifying the exact nature of the functional groups involved. A clearer understanding
of these important issues will require detailed study in the future using, for example,
oligonucleotides containing known types and amounts of base lesions. Nevertheless, it has been
shown that the •OH attacks the deoxyribose moiety to produce a variety of products resulting
from hydrogen abstractions of the pentose ring (47). These reactions result in ring-opening
products and the formation of a carbon-centered radical at the 5' carbon position, which links
deoxyribose residues with the phosphodiester groups. The attack of the •OH on the deoxyribose
moiety is known to result ultimately in strand breakage (47). In this regard, the damage inflicted
on the DNA from the normal and cancer tissue was broadly different, as suggested by melting
studies conducted in our laboratory, which showed that a high proportion of the DNA from the
cancerous breast exhibits a series of submelts arising from a significant degree of strand
breakage. In contrast, the submelts were not evident in the DNA from the normal breast
(unpublished data).
It is highly significant that FT-IR analysis defined a nonrandom progression of
changes in the spectral properties of DNA from individuals within the normal group, culminating
in a spectrum closely resembling the mean spectrum of the cancer DNA. In fact, this
progression of spectral changes measured, for example, by distance-C was found to vary directly
with measures from a statistical model developed previously (9), based on
log10(Fapy-A/8-OH-Ade), which compared products from the one-electron oxidation of adenine.
The structural alterations measured by IR spectroscopy thus parallel the redox-coupled
conversions of putatively non-mutagenic Fapy derivatives to mutagenic 8-OH adducts in breast
tissues, as defined previously (9). Consequently, each of these analytical methods likely
measures a progressive, premalignant condition among women with histologically normal breast
tissue as their IR spectra and base lesion ratios approximate those of the DNA from the cancer
tissues.

We suggest that once such a premalignant condition exists, there is a substantial steady-state
concentration of mutagenic DNA lesions maintained by a balance between oxidative attacks and
DNA repair processes (9). Then, time-dependent accumulations of mutagenic changes may
occur in the DNA. This multistage process ultimately gives rise to conditions that are
necessary and sufficient for malignant conversion and tumorigenesis. This
perspective is consistent with recent results described by Frenkel et al. (13) indicating the
potential of circulating levels of autoantibodies reactive with 5-hydroxymethyl-2'-deoxyuridine
to be predictive of future incidence of breast cancer among women. These autoantibodies are
presumably elicited by 5-hydroxymethyl-2'-deoxyuridine derived from an oxidative attack on
breast DNA.

In  the  past  25  years,  the  risk  of  female  breast  cancer  has  substantially  increased  to  the 
current  11%  lifetime  incidence  rate.  This  likely  is  associated  with  increased  exposures  to 
materials  capable  of  eliciting  a  mutagenic  response  and/or  dietary  deficiencies  in  protecting 
against oxidative damage. Given that carcinogenesis is a chronic process driven by an
accumulation of random mutational events that often develop over a number of years, it is
expected  that  a  relatively  high  proportion  of  the  population  shows  evidence  of  premalignant 
changes  in  the  DNA,  as  found  in  this  report.  However,  it  is  expected  that  only  a  portion  of  those 
individuals,  depending  on  their  specific  genetic  susceptibilities  and  DNA  lesion  profiles,  would 
achieve  the  necessary  conditions  leading  to  malignant  conversion  and  tumorigenesis.  Therefore, 
populations  residing  in  areas  with  variable  incidences  of  breast  cancer  likely  have  proportionate 
differences  in  the  percentage  of  individuals  having  premalignant  DNA  changes  of  the  type 
described. 

Our results predict that future incidences of cancer would arise preferentially among the
group of normal individuals whose IR spectra most closely resemble that of the cancer DNA
(Group 3; Fig. 10). One way to test this hypothesis is to conduct a spectral analysis of the
DNA from the MNT obtained near the IDC. The noncancer tissue from the ipsilateral
breast has been shown to have a high incidence of recurring primary carcinoma (48). A
comparison of the DNA from the MNT with the established predictive model showed that 8 of
10 tissues analyzed were classified as cancer. That is, a substantial, distinct group of individuals
known to be at heightened risk for recurrent breast cancer exhibited a pronounced risk on the
basis of the DNA model, although the tissues were classified normal on the basis of a routine
microscopic examination. Our previous findings (9) showing that the log10 DNA base
concentrations and ratios between the IDC and the MNT were not statistically different are
completely consistent with these spectral results with the DNA.

The structural modifications in the breast DNA represent a premalignant state almost
certainly characterized by various degrees of genetic instability, which was also noted previously
from the base-model data (9). This is potentially important in that it has been shown that
oxidants, such as the •OH, participate in the activation of protooncogenes and the inactivation of
tumor suppressor genes resulting, for example, in an increase in the mutagenesis of hot-spot
codons of the human p53 gene (49). Given the ability of •OH-induced base lesions to create
mutagenic events, it is clear that this process is associated intimately with carcinogenesis (9, 14).
The demonstrated impact of the •OH on DNA indicates that familial susceptibility to cancer,
defined by cancer-associated genes (50), most probably is influenced by the type and degree of
DNA damage demonstrated in the present study.

The  extension  of  the  progressive  spectral  changes  observed  in  the  normal  breast  to 
overlap  those  of  the  cancer  group  (Fig.  10),  which  presently  is  difficult  to  understand  fully,  may
suggest  that  cancer  risk  is  a  continuum  relating  to  the  various  types  and  degrees  of  DNA
modification.  In  this  context,  the  possibility  exists  that  the  changes  within  the  cancer  group  were 
related  significantly  to  the  known  constitutive  propensity  of  breast  cells  to  generate  high 
concentrations  of  H2O2,  the  apparent  precursor  of  •OH  (51). 

A  major  research  emphasis  on  the  observed  damage  inflicted  on  DNA  seems  appropriate
in  the  future,  as  well  as  studies  directed  toward  reducing  radical  concentrations  in  normal  breast 
tissues  so  that  the  DNA  repair  systems  (e.g.,  the  endonucleases)  are  better  able  to  control  or 
reverse  the  damage.  From  the  clinical  perspective,  this  may  well  be  accomplished  by  increasing 
the  intake  of  antioxidants  and  the  use  of  drugs  containing  antioxidant  functional  groups  that 
preferentially  target  breast  cells.  Overall,  the  findings  lend  support  to  the  concept  that  the 
cellular  redox  status  (9)  and  •OH  concentrations  play  a  pivotal  role  in  cancer  development  in  the 
female  breast,  as  previously  postulated  (9).  These  conditions  may  well  be  reversible  and,  thus, 
should  be  given  special  consideration  in  efforts  to  reduce  the  incidence  of  breast  cancer. 

Aside  from  providing  a  new  perspective  on  the  etiology  of  breast  cancer,  it  is  apparent
that  the  FT-IR  analyses  afford  a  promising  means  for  assessing  the  DNA  status  relative  to  the 
risk  of  developing  breast  cancer.  This  conclusion  is  exemplified  by  using  the  predictive  model 
for  cancer/noncancer  status  shown  in  Table  6  to  derive  risk  scores,  such  as  those  given  in  Fig.  8.

The  FT-IR  analysis  is  rapid  and  requires  minimal  (μg)  amounts  of  tissue,  which  are  likely
obtainable  via  fine  needle  biopsy  procedures.  Results  of  such  an  analysis  will  help  in  defining 
groups  of  individuals  of  low  to  high  potential  risk  at  the  earliest  stages  of  oncogenesis  when 
therapeutic  intervention  would  be  especially  effective. 

In  conclusion,  59%  of  normal  women  studied  from  the  Puget  Sound,  Washington  area 
were  shown  to  have  a  DNA  phenotype  in  the  breast  representing  a  premalignant  state  and  thus 
potentially  placing  them  at  high  risk  for  developing  breast  cancer.  In  this  respect,  the  degree  of 
cancer  risk,  which  presumably  diminishes  as  the  spectral  profiles  increasingly  mismatch  those  of 
the  cancer  profile,  can  be  calculated  on  the  basis  of  probability  models,  such  as  the  one  used  in
this  study  (Fig.  8). 
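
As an illustration only, the idea of converting a spectral profile into a probability-based risk score can be sketched as follows. The sketch assumes a logistic form for the model; the function name, the two "spectral summary scores," and all coefficient values are hypothetical and are not the model of Table 6 or the scores of Fig. 8.

```python
import math


def risk_score(spectral_scores, coefficients, intercept):
    """Return a value in (0, 1) from an assumed logistic model.

    spectral_scores: hypothetical numeric summaries of a DNA spectrum.
    coefficients, intercept: hypothetical fitted model parameters.
    """
    if len(spectral_scores) != len(coefficients):
        raise ValueError("one coefficient is required per spectral score")
    # Linear predictor: intercept plus the weighted sum of the scores.
    linear = intercept + sum(b * x for b, x in zip(coefficients, spectral_scores))
    # The logistic function maps the linear predictor onto (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical parameters, chosen only to make the example concrete.
coefficients = [1.2, -0.8]
intercept = -0.5

# A profile resembling the cancer group scores high; a mismatched one scores low.
cancer_like = risk_score([2.0, -1.0], coefficients, intercept)
normal_like = risk_score([-2.0, 1.0], coefficients, intercept)
print(f"cancer-like: {cancer_like:.3f}, normal-like: {normal_like:.3f}")
```

Under such a form, the score falls smoothly as a profile moves away from the cancer profile, which is the behavior described above for the degree of risk.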

The  proportion  of  normal  women  with  the  cancer-like  phenotype  is  disturbingly  high. 
However,  therapeutic  approaches  that  stabilize  the  redox  conditions  in  breast  cells,  thus  reversing 
oxidative  fluxes,  may  substantially  decrease  the  cancer  risk,  as  would  the  successful  delivery  of 
sufficient  antioxidant  or  reductant  compounds  to  the  breast  that  would  counteract  the  damaging 
effects  of  the  •OH.  In  a  broader  sense,  the  GC-MS/SIM  and  FT-IR  spectral  models  also  may  be
widely  useful  in  predicting  risk  for  other  types  of  cancer. 

Further  studies  are  necessary  to  determine  the  degree  to  which  the  premalignant  state 
occurs  in  the  breast  DNA  of  women  from  diverse  geographical  areas  and  in  relation  to  various
risk  factors  [e.g.,  chemical  exposures  implicated  recently  in  breast  carcinogenesis  (52)].  The
present  findings  may  at  least  partly  explain  the  high  occurrence  rate  of  breast  cancer  among
women  and  form  the  basis  for  an  important  new  paradigm  for  the  prediction,  early  intervention 
and  treatment  of  this  disease.  Considering  the  severity  of  this  problem,  there  is  a  critical  need  for 
a  controlled  prospective  study  conducted  on  the  basis  of  the  present  and  previous  (9)  findings. 
This  should  be  directed  toward  testing  the  association  between  the  analytical  parameters 
described  and  various  epidemiological  factors  related  to  the  occurrence  of  the  disease. 

IV.  FUTURE  PLANS

The  findings  described  provide  a  firm  basis  for  a  prospective  study  testing  the  hypothesis 
that  the  nonrandom  progression  of  oxidative  modifications  in  the  normal  female  breast  allows  for
the  formulation  of  statistical  models  that  are  predictive  of  breast  cancer  in  the  population.  In 
such  a  study,  attention  should  focus  on  understanding  temporal  changes  in  the  DNA  status  of 
women  in  relation  to  breast  cancer  risk.  In  addition,  the  data  on  the  IDC  evoke  the  question  of 
the  nature  of  statistical  DNA  models  derived  from  other  cancer-related  breast  disease,  such  as 
atypical  hyperplasia.  We  believe  that  future  work  in  this  direction  should  be  considered. 

Overall,  the  data  support  the  conclusion  that  the  DNA-related  changes  leading  to  carcinogenesis 
are  phylogenetically  conserved  and  thus  can  be  exploited  in  DNA  modeling  related  to  a  wide 
variety  of  hormone-  and  xenobiotic-induced  cancers  (e.g.,  those  associated  with  environmental 
contaminants). 